Abstract. We propose a procedure for automated implicit inductive 
theorem proving for equational specifications made of rewrite rules with 
conditions and constraints. The constraints are interpreted over construc- 
tor terms (representing data values), and may express syntactic equality, 
disequality, ordering and also membership in a fixed tree language. Con- 
strained equational axioms between constructor terms are supported and 
can be used in order to specify complex data structures like sets, sorted 
lists, trees, powerlists... 

Our procedure is based on tree grammars with constraints, a formalism 
which can describe exactly the initial model of the given specification 
(when it is sufficiently complete and terminating). They are used in the 
inductive proofs first as an induction scheme for the generation of sub- 
goals at induction steps, second for checking validity and redundancy 
criteria by reduction to an emptiness problem, and third for defining 
and solving membership constraints. 

We show that the procedure is sound and refutationally complete. It
generalizes former test set induction techniques and yields natural proofs
for several non-trivial examples presented in the paper. These examples
are difficult to specify and to carry out automatically with related
induction procedures.

Keywords: Automated Inductive Theorem Proving, Rewriting, Tree 
Automata, Program Verification. 



1 Introduction 



Given a specification $\mathcal{R}$ of a program or system S made of equational
Horn clauses, proving a property P for S generally amounts to showing the
validity of P in the minimal Herbrand model of $\mathcal{R}$, also called the initial
model of $\mathcal{R}$ (inductive validity). In this perspective, it is important to
have automated induction theorem proving procedures supporting a
specification language expressive enough to axiomatize complex data
structures like sets, sorted lists, powerlists, complete binary trees, etc.
Moreover, it is also important to be able to automatically generate the
induction schemas used for inductive proofs in order to minimize user
interaction. However, theories of complex data structures generate complex
induction schemes, and the automation of inductive proofs is therefore
difficult for such theories.

It is common to assume that $\mathcal{R}$ is built with constructor function symbols
(to construct terms representing data) and defined symbols (representing the
operations defined on constructor terms). Assuming in addition the sufficient
completeness of $\mathcal{R}$ (every ground (variable-free) term is reducible, using
the axioms of $\mathcal{R}$, to a constructor term) and the termination of $\mathcal{R}$, a set
of representatives for the initial model of $\mathcal{R}$ (the model in which we want
to prove the validity of conjectures) is the set of ground constructor terms
not reducible by $\mathcal{R}_{\mathcal{C}}$ (the subset of equations of $\mathcal{R}$ between terms made
of constructor symbols), called constructor normal forms.

In the case where the constructors are free ($\mathcal{R}_{\mathcal{C}} = \emptyset$), the set of
constructor normal forms is simply the set of ground terms built with
constructors, and it is very easy in this case to define an induction schema.
This situation is therefore convenient for inductive reasoning, and many
inductive theorem provers require free constructors, termination and
sufficient completeness. However, it is not expressive enough to define
complex data structures. With rewrite rules between constructors, the
definition of an induction schema is more complex, and requires a finite
description of the set of constructor normal forms. Some progress has been
made, e.g. in [5] and [6], in the direction of handling specifications with
non-free constructors, with severe restrictions (see related work below).

Tree automata (TA) with constraints, or equivalently regular tree grammars
with constraints, have appeared to be a well suited framework for the decision
of problems related to term rewriting (see [10] for a survey). This is the case
for instance of ground reducibility, the property that all the ground instances
of a given term are reducible by a given term rewriting system (TRS). This
property was originally shown decidable for all TRS by David Plaisted [26];
it is reducible to the (decidable) problem of emptiness for tree automata with
disequality constraints (see e.g. [11]). TA with constraints permit a finite
representation of the set of constructor normal forms when $\mathcal{R}_{\mathcal{C}}$ is a
left-linear TRS (a set of rewrite rules without multiple occurrences of
variables in their left-hand sides). Indeed, on the one hand TA can do linear
pattern-matching, hence they can recognize terms which are reducible by
$\mathcal{R}_{\mathcal{C}}$, and on the other hand, the class of TA languages is closed under
complementation. When the axioms of $\mathcal{R}_{\mathcal{C}}$ are not linear, or are
constrained, some extensions of TA (or grammars) are necessary, with
transitions able to check constraints on the input term, see e.g. [10].

In this paper, we propose a framework for inductive theorem proving for
theories containing constrained rewrite rules between constructor terms and
conditional and constrained rewrite rules for defined functions. The key idea
is a strong and natural integration of tree grammars with constraints in an
implicit induction procedure, where they are used as induction schema. Very
roughly, our procedure starts with the automatic computation of an induction
schema, in the form of a constrained tree grammar generating constructor
normal forms. This grammar is used later for the generation of subgoals from
a conjecture C, by the instantiation of variables using the grammar's
production rules, triggering induction steps during the proof. All generated
subgoals are either deleted, following some criteria, or reduced, using
axioms, induction hypotheses, or conjectures not yet proved, provided that
they are smaller than the goal to be proved. Reduced subgoals then become new
conjectures and C becomes an induction hypothesis. Moreover, constrained tree
grammars are used as a decision procedure for checking the deletion criteria
during induction steps.

Our method subsumes former test set induction procedures like [7, 2, 5], by
reusing former theoretical works on tree automata with constraints. It is sound
and refutationally complete (any conjecture that is not valid in the initial
model will be disproved) when $\mathcal{R}$ is sufficiently complete and the
constructor subsystem $\mathcal{R}_{\mathcal{C}}$ is terminating. Without the above hypotheses,
it still remains sound and refutationally complete for a restricted kind of
conjectures, where all the variables are constrained to belong to the language
of constructor normal forms. This restriction is expressible in the
specification language (see below). When the procedure fails, it implies that
the conjecture is not an inductive theorem, provided that $\mathcal{R}$ is strongly
complete (a stronger condition than sufficient completeness) and ground
confluent. There is no requirement for termination of the whole set of rules
$\mathcal{R}$, unlike [7, 2], but instead only for separate termination of the
respective sets of rules for defined functions and for the constructors.

Moreover, if a conjecture C restricted as above is proved in a sufficiently
complete specification $\mathcal{R}$ and $\mathcal{R}$ is further consistently extended into
$\mathcal{R}'$ with additional axioms for specifying partial (non-constructor)
functions, then the former proof of C remains valid in $\mathcal{R}'$, see Section 7.

The support of constraints makes it possible, in some cases, to use the
constrained completion technique of [23] in order to transform a
non-terminating theory into a terminating one, by the addition of ordering
constraints in constructor rules, see Section 5.6. In particular, it allows
proofs modulo non-orientable axioms, without having to modify the core of our
procedure.

We shall consider a specification of ordered lists as a running example
throughout the paper. Consider first non-stuttering lists (lists which do not
contain two equal successive elements) built with the constructor symbols
$\emptyset$ (empty list) and $\mathit{ins}$ (list insertion) and following this
rewrite rule:

\[ \mathit{ins}(x, \mathit{ins}(x, y)) \to \mathit{ins}(x, y) \qquad (c_0) \]

Rewrite rules can be enriched with constraints built on predicates with a
fixed interpretation on ground constructor terms. For example, using ordering
constraints built with $\succ$ we can specify ordered lists by the following
axiom:

\[ \mathit{ins}(x_1, \mathit{ins}(x_2, y)) \to \mathit{ins}(x_2, \mathit{ins}(x_1, y)) \; \llbracket x_1 \succ x_2 \rrbracket \qquad (c_1) \]
Another interesting example is the case of membership constraints of the
form $x : L$ where $L$ is a fixed regular tree language (containing only terms
made of constructor symbols). Such constraints can be useful in the context of
system verification. Assume that we have specified a defined symbol
$\mathit{trace}$ characterizing the set of possible sequences of events of some
system, i.e. $\mathit{trace}(\ell)$ reduces to $\mathit{true}$ iff $\ell$ is a correct list
of events (represented as constructor terms). Now, assume also that we have
defined a regular language $\mathit{Bad}$ (of ground constructor terms)
representing lists of faulty events, by means of, e.g., a (finite) tree
grammar. We can express in this way, for instance, that some undesirable event
occurs eventually, or that some event is always followed (eventually) by an
expected answer, or any kind of linear temporal property. We can express with
the constrained conjecture
$\mathit{trace}(y) \neq \mathit{true} \; \llbracket y : \mathit{Bad} \rrbracket$ that no bad list is a
trace of the system. Hence, showing that this conjecture is an inductive
consequence of the specification of the system amounts to verifying trace
properties (i.e. reachability properties). More details about this problem, in
the context of security protocol verification, are given in Section 3.7.
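To make the notion of a regular language of bad event lists concrete, here is a minimal Python sketch of a deterministic bottom-up tree automaton. Everything in it is an assumption made for illustration: the two event constants `ok` and `fault`, the state names, and the choice of taking as bad every list that contains at least one `fault`. It is not the language used in Section 3.7.

```python
# Ground terms are nested tuples: ("ok",), ("fault",), ("empty",), ("ins", event, list).
# Hypothetical transition table: (symbol, states of the arguments...) -> state.
DELTA = {
    ("ok",): "q_ok",
    ("fault",): "q_fault",
    ("empty",): "q_clean",
    ("ins", "q_ok", "q_clean"): "q_clean",
    ("ins", "q_fault", "q_clean"): "q_bad",
    ("ins", "q_ok", "q_bad"): "q_bad",
    ("ins", "q_fault", "q_bad"): "q_bad",
}
FINAL = {"q_bad"}

def run(t):
    """State reached bottom-up on the ground term t, or None if no transition applies."""
    states = tuple(run(arg) for arg in t[1:])
    return DELTA.get((t[0],) + states)

def in_bad(t):
    """Membership test for the (illustrative) language Bad."""
    return run(t) in FINAL
```

A membership constraint of the form y : Bad is then interpreted, for a ground instance of y, by a call such as `in_bad(("ins", ("ok",), ("ins", ("fault",), ("empty",))))`.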

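We also consider stronger constraints which restrict constructor terms to be
in normal form (i.e. not reducible by the axioms). Let us come back to the
example of non-stuttering sorted lists (sorted lists without duplication), and
add to the above rules the axioms below which define a membership predicate
$\Subset$, using the information that lists are sorted:

\[
\begin{array}{lll}
x \Subset \emptyset & \to \mathit{false} & (m'_0) \\
x_1 \Subset \mathit{ins}(x_2, y_2) & \to \mathit{true} \; \llbracket x_1 \approx x_2 \rrbracket & (m'_1) \\
x_1 \Subset y_1 & \to \mathit{false} \; \llbracket y_1 \approx \mathit{ins}(x_2, y_2),\ x_1 \prec x_2,\ y_1 : \mathrm{NF} \rrbracket & (m'_2) \\
x_1 \Subset \mathit{ins}(x_2, y_2) & \to x_1 \Subset y_2 \; \llbracket x_2 \prec x_1 \rrbracket & (m'_3)
\end{array}
\]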



The constraint $y_1 : \mathrm{NF}$ expresses the fact that this subterm is a
constructor term in normal form, i.e. that it is a sorted list. Without this
constraint, the specification would be inconsistent. Indeed, let us consider
the ground term $t = 0 \Subset \mathit{ins}(s(0), \mathit{ins}(0, \emptyset))$. This term $t$
can be reduced into both $\mathit{true}$ and $\mathit{false}$, since
$\mathit{ins}(s(0), \mathit{ins}(0, \emptyset))$ is not in normal form. In Section 3, we
elaborate on these examples on sorted lists. Using constraints of the form
$\cdot : \mathrm{NF}$ as above also permits the user to specify, directly in the
rewrite rules, some ad hoc reduction strategies for the application of
rewriting. Such strategies include for instance several refinements of the
innermost strategy, which corresponds to the call-by-value computation in
functional programming languages, where arguments are fully evaluated before
the function application.

Some non-trivial examples, including the above one, treated with our method
are given in Section 3 (sorted lists and verification of trace properties) and
Section 7 (powerlists). Our procedure yields very natural and readable proofs
on these examples, which are difficult (if not impossible) to specify and to
carry out with most other induction procedures.

Related work. The principle of our procedure is close to test-set induction
approaches [7, 2]. The real novelty here is that test-sets are replaced by
constrained tree grammars, the latter being more precise induction schemes.
Indeed, they provide an exact finite description of the initial model of the
given specification (under some assumptions like sufficient completeness and
termination for axioms), whereas cover-sets and test-sets are
over-approximative in similar cases.

The soundness of cover-set [30] and test-set [7, 2] induction techniques does
not require that the constructors are free. But, in this case, cover-sets and
test-sets are over-approximating induction schemas, in the sense that they may
represent some reducible ground terms. This may cause the failure (a result of
the form "don't know") of the induction proof. Moreover, the refutational
completeness of the test-set induction technique is not guaranteed in this
case.

The first author and Jouannaud [5] have used tree automata techniques to 
generalize test set induction to specifications with non-free constructors. This 
work has been generalized in [6] for membership equational logic. These ap- 
proaches, unlike the procedure presented in this paper, work by transforming the 
initial specification in order to get rid of rewrite rules for constructors. Moreover, 
the axioms for constructors are assumed to be unconstrained and unconditional 
left-linear rewrite rules, which is still too restrictive for the specification of struc- 
tures like sets or sorted lists... 

The theorem prover of ACL2 [22] is a new version of the Boyer-Moore theorem
prover, Nqthm. Its input language is a subset of the programming language
Common LISP. It is a very general formalism for the specification of systems,
and therefore permits in particular the specification of the complex data
structures mentioned above. The example of sorted lists, presented in
Section 3, can be processed with ACL2, but the proof requires the user to add
some lemmas manually, whereas the proof with our procedure does not require
any lemma (see Section 3.5). The specification language of our approach is
much less expressive than that of ACL2, but the intention is to minimize the
interaction with the user during the proof process, in order to spare the user
the time and the high level of expertise (both in the system to be verified
and in the theorem prover) which are often required in order to come up with
the necessary key lemmas. An interactive proof on the same specification with
SPIKE is also presented in Section 3.

Kapur [20] has proposed a method (implemented in the system RRL) for 
mechanizing cover set induction if the constructors are not free. He defines par- 
ticular specifications which may include in the declaration of function symbols 
(including constructors) some applicability conditions. This handles in particu- 
lar the specification of powerlists, as illustrated by some examples. We show in 
Section 7 how our method can address similar problems. 

In [27], Sengler proposes a system INKA for automated termination analysis 
of recursively defined algorithms over data types like sets and arrays. It can handle 
constructor relations, under restrictions. When it succeeds, this method provides 
an explicit induction scheme which can be exploited with an explicit inductive 
theorem proving procedure. 

We lack a concrete basis of comparison between our method and the two
above approaches, because it was impossible for us to process our examples
with INKA (which has been discontinued since 1997) or RRL. Let us outline some
other important differences between our procedure and these approaches. The
above explicit induction procedures are not well suited for the refutation of
false conjectures. When such a system fails, it is not possible to conclude
whether the conjecture is not valid or whether the system needs assistance
from the user in order to complete the proof. By contrast, our implicit
induction procedure is refutationally complete: any false conjecture will be
refuted, under the assumptions mentioned above. This property is of particular
interest for debugging specifications of flawed systems or programs, or for
the detection of attacks on security protocols like in [3] (see Section 3.7).
Finally, unlike explicit induction systems which are hierarchical, our
procedure supports mutual induction. It is crucial for handling mutually
recursive functions [2].

2 Preliminaries 

The reader is assumed familiar with the basic notions of term rewriting [16] and 
first-order logic. Notions and notations not defined here are standard. 

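Terms and substitutions. We assume given a many-sorted signature
$(\mathcal{S}, \mathcal{F})$ (or simply $\mathcal{F}$ for short) where $\mathcal{S}$ is a set of sorts and
$\mathcal{F}$ is a finite set of function symbols with arities. We assume moreover
that the signature $\mathcal{F}$ comes in two parts, $\mathcal{F} = \mathcal{C} \uplus \mathcal{D}$, where
$\mathcal{C}$ is a set of constructor symbols and $\mathcal{D}$ is a set of defined symbols.
Let $\mathcal{X}$ be a family of sorted variables. We sometimes denote variables with
a sort exponent, like $x^S$, in order to indicate that $x$ has sort
$S \in \mathcal{S}$. The set of well-sorted terms over $\mathcal{F}$ (resp. constructor
well-sorted terms) with variables in $\mathcal{X}$ will be denoted by
$\mathcal{T}(\mathcal{F}, \mathcal{X})$ (resp. $\mathcal{T}(\mathcal{C}, \mathcal{X})$). The subset of
$\mathcal{T}(\mathcal{F}, \mathcal{X})$ (resp. $\mathcal{T}(\mathcal{C}, \mathcal{X})$) of variable-free terms, or ground
terms, is denoted $\mathcal{T}(\mathcal{F})$ (resp. $\mathcal{T}(\mathcal{C})$). We assume that each sort
contains a ground term. The sort of a term $t \in \mathcal{T}(\mathcal{F}, \mathcal{X})$ is denoted
$\mathit{sort}(t)$.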

A term $t$ is identified as usual with a function from its set of positions
(strings of positive integers) $\mathcal{P}os(t)$ to symbols of $\mathcal{F}$ and $\mathcal{X}$.
We denote the empty string (root position) by $\Lambda$. The length of a
position $p$ is denoted $|p|$. The depth of a term $t$, denoted $d(t)$, is the
maximum of $\{ |p| \mid p \in \mathcal{P}os(t) \}$. The subterm of $t$ at position $p$
is denoted by $t|_p$. The result of replacing $t|_p$ with $s$ at position $p$
in $t$ is denoted by $t[s]_p$. This notation is also used to indicate that $s$
is a subterm of $t$, in which case $p$ may be omitted. We denote the set of
variables occurring in $t$ by $\mathit{var}(t)$. A term $t$ is linear if every
variable of $\mathit{var}(t)$ occurs exactly once in $t$.
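The notions of position, subterm, replacement, depth and linearity can be illustrated by the following Python sketch. The encoding is an assumption made for this illustration only: a term is a nested tuple whose first component is the function symbol, a variable is a plain string, and a position is a tuple of positive integers (the empty tuple playing the role of the root position).

```python
def positions(t):
    """All positions of t; the root position is the empty tuple."""
    if isinstance(t, str):                      # a variable
        return [()]
    return [()] + [(i,) + p
                   for i, arg in enumerate(t[1:], start=1)
                   for p in positions(arg)]

def subterm(t, p):
    """The subterm of t at position p."""
    for i in p:
        t = t[i]
    return t

def replace(t, p, s):
    """The term obtained by replacing the subterm of t at position p with s."""
    if not p:
        return s
    i = p[0]
    return t[:i] + (replace(t[i], p[1:], s),) + t[i + 1:]

def depth(t):
    """Maximal length of a position of t."""
    return max(len(p) for p in positions(t))

def is_linear(t):
    """True iff every variable occurs exactly once in t."""
    occurrences = [subterm(t, p) for p in positions(t)
                   if isinstance(subterm(t, p), str)]
    return len(occurrences) == len(set(occurrences))
```

For instance, with `t = ("ins", "x", ("ins", "x", "y"))`, the subterm at position `(2, 1)` is the variable `"x"`, the depth is 2, and `t` is not linear since `"x"` occurs twice.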

A substitution is a finite mapping $\{x_1 \mapsto t_1, \ldots, x_n \mapsto t_n\}$
where $x_1, \ldots, x_n \in \mathcal{X}$ and $t_1, \ldots, t_n \in \mathcal{T}(\mathcal{F}, \mathcal{X})$. As
usual, we identify substitutions with their morphism extension to terms. A
variable renaming is a substitution mapping variables to variables. We use
postfix notation for the application and composition of substitutions. A
substitution $\sigma$ is grounding for a term $t$ if $t\sigma$ is ground. The most
general common instance of some terms $t_1, \ldots, t_n$ is denoted by
$\mathit{mgi}(t_1, \ldots, t_n)$.

Constraints and constrained terms. We assume given a constraint language
$\mathcal{L}$, which is a finite set of predicate symbols with a recursive Boolean
interpretation in the domain of ground constructor terms of $\mathcal{T}(\mathcal{C})$.
Typically, $\mathcal{L}$ may contain the syntactic equality $\cdot \approx \cdot$
(syntactic disequality $\cdot \not\approx \cdot$), some (recursive)
simplification ordering $\cdot \prec \cdot$ on ground constructor terms (for
instance a lexicographic path ordering [16]), and membership $\cdot : L$ to a
fixed tree language $L \subseteq \mathcal{T}(\mathcal{C})$ (like for instance the languages
of well-sorted terms or constructor terms in normal form). Constraints on the
language $\mathcal{L}$ are Boolean combinations of atoms of the form
$P(t_1, \ldots, t_n)$ where $P \in \mathcal{L}$ and
$t_1, \ldots, t_n \in \mathcal{T}(\mathcal{C}, \mathcal{X})$. By convention, an empty combination is
interpreted to true.

The application of substitutions is extended from terms to constraints in a
straightforward way, and we may therefore define a solution for a constraint
$c$ as a (constructor) substitution $\sigma$ grounding for all terms in $c$ and
such that $c\sigma$ is interpreted to true. The set of solutions of the
constraint $c$ is denoted $\mathit{sol}(c)$. A constraint $c$ is satisfiable if
$\mathit{sol}(c) \neq \emptyset$ (and unsatisfiable otherwise).

A constrained term $t \, \llbracket c \rrbracket$ is a linear term
$t \in \mathcal{T}(\mathcal{F}, \mathcal{X})$ together with a constraint $c$, which may share some
variables with $t$. Note that the assumption that $t$ is linear is not
restrictive, since any non-linearity may be expressed in the constraint; for
instance $f(x, x) \, \llbracket c \rrbracket$ is semantically equivalent to
$f(x, x') \, \llbracket c \wedge x \approx x' \rrbracket$, where the variable $x'$ does not
occur in $c$.

Constrained clauses. A literal is an equation $s = t$ or a disequation
$s \neq t$ or an oriented equation $s \to t$ between two terms. A constrained
clause $C \, \llbracket c \rrbracket$ is a disjunction $C$ of literals together with a
constraint $c$. A constrained clause $C \, \llbracket c \rrbracket$ is said to subsume a
constrained clause $C' \, \llbracket c' \rrbracket$ if there is a substitution $\sigma$ such
that $C\sigma$ is a sub-clause of $C'$ and $c' \wedge \neg c\sigma$ is unsatisfiable.

A tautology is a constrained clause
$s_1 = t_1 \vee \ldots \vee s_n = t_n \, \llbracket d \rrbracket$ such that $d$ is a conjunction of
equational constraints, $d = u_1 \approx v_1 \wedge \ldots \wedge u_k \approx v_k$, and there
exists $i \in [1..n]$ such that $s_i\sigma = t_i\sigma$ where $\sigma$ is the mgu of $d$.

Orderings. A reduction ordering is a well-founded ordering on
$\mathcal{T}(\mathcal{F}, \mathcal{X})$ monotonic wrt contexts and substitutions. A simplification
ordering is a reduction ordering which moreover contains the strict subterm
ordering. We assume from now on given a simplification ordering $>$ total on
$\mathcal{T}(\mathcal{F})$, defined, e.g., on top of a precedence as an lpo $\succ_{lpo}$ [16].

The multiset extension $>^{mul}$ of an ordering $>$ is defined as the smallest
ordering relation on multisets such that
$M \cup \{t\} >^{mul} M \cup \{s_1, \ldots, s_n\}$ if $t > s_i$ for all
$i \in [1..n]$. The extension $>_e$ of the ordering $>$ on terms to literals is
defined as the multiset extension $>^{mul}$ to the multisets containing the
term arguments of the literals. The extension of the ordering $>$ on terms to
clauses is the multiset extension $>_e^{mul}$ applied to the multiset of
literals.
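As an illustration, the multiset extension can be computed with the following Python sketch. It relies on the standard equivalent characterization for a strict ordering (after removing the common part of two distinct multisets, every remaining element of the smaller one must be dominated by some remaining element of the larger one) rather than on the closure-based definition above; the function name and the encoding of multisets as Python iterables are assumptions of this sketch.

```python
from collections import Counter

def multiset_greater(m, n, greater):
    """True iff m >^mul n, where greater(a, b) decides the strict ordering a > b."""
    m, n = Counter(m), Counter(n)
    only_m, only_n = m - n, n - m          # drop the common part of both multisets
    if not only_m and not only_n:
        return False                       # equal multisets are not comparable by >
    # every leftover element of n must be dominated by a leftover element of m
    return all(any(greater(a, b) for a in only_m) for b in only_n)
```

For example, with the usual ordering on integers, `multiset_greater([5], [4, 4, 3], lambda a, b: a > b)` holds, since the single element 5 is replaced by finitely many smaller elements.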

Constrained rewriting. A conditional constrained rewrite rule is a constrained
clause of the form $\Gamma \Rightarrow l \to r \, \llbracket c \rrbracket$ such that $\Gamma$ is a conjunction
of equations, called the condition of the rule, the terms $l$ and $r$ (called
resp. left- and right-hand side) are linear and have the same sort, and $c$ is
a constraint. When the condition $\Gamma$ is empty, it is called a constrained
rewrite rule. A set of conditional constrained, resp. constrained, rules is
called a conditional constrained (resp. constrained) rewrite system.
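Let $\mathcal{R}$ be a conditional constrained rewrite system. The relation
$s \, \llbracket d \rrbracket$ rewrites to $t \, \llbracket d \rrbracket$ by $\mathcal{R}$, denoted
$s \, \llbracket d \rrbracket \to_{\mathcal{R}} t \, \llbracket d \rrbracket$, is defined recursively by the existence of
a rule $\rho \equiv \Gamma \Rightarrow l \to r \, \llbracket c \rrbracket \in \mathcal{R}$, a position
$p \in \mathcal{P}os(s)$, and a substitution $\sigma$ such that $s|_p = l\sigma$,
$t|_p = r\sigma$, $d\sigma \wedge \neg c\sigma$ is unsatisfiable, and
$u\sigma \downarrow_{\mathcal{R}} v\sigma$ for all $u = v \in \Gamma$. The transitive and reflexive
transitive closures of $\to_{\mathcal{R}}$ are denoted $\to^{+}_{\mathcal{R}}$ and
$\to^{*}_{\mathcal{R}}$, and $u \downarrow_{\mathcal{R}} v$ stands for
$\exists w,\ u \to^{*}_{\mathcal{R}} w \leftarrow^{*}_{\mathcal{R}} v$.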

Note the semantic difference between conditions and constraints in rewrite
rules. The validity of the condition is defined wrt the system $\mathcal{R}$ whereas
the interpretation of the constraint is fixed and independent from $\mathcal{R}$.

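A constrained term $s \, \llbracket c \rrbracket$ is reducible by $\mathcal{R}$ if there is some
$t \, \llbracket c \rrbracket$ such that $s \, \llbracket c \rrbracket \to_{\mathcal{R}} t \, \llbracket c \rrbracket$. Otherwise
$s \, \llbracket c \rrbracket$ is called irreducible, or an $\mathcal{R}$-normal form. A
substitution $\sigma$ is irreducible by $\mathcal{R}$ if its image contains only
$\mathcal{R}$-normal forms. A constrained term $t \, \llbracket c \rrbracket$ is ground reducible
(resp. ground irreducible) if $t\sigma$ is reducible (resp. irreducible) for every
irreducible solution $\sigma$ of $c$ grounding for $t$.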

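The system $\mathcal{R}$ is terminating if there is no infinite sequence
$t_1 \to_{\mathcal{R}} t_2 \to_{\mathcal{R}} \cdots$; $\mathcal{R}$ is ground confluent if for any
ground terms $u, v, w \in \mathcal{T}(\mathcal{F})$,
$v \leftarrow^{*}_{\mathcal{R}} u \to^{*}_{\mathcal{R}} w$ implies that $v \downarrow_{\mathcal{R}} w$; and
$\mathcal{R}$ is ground convergent if $\mathcal{R}$ is both ground confluent and
terminating. The depth of a non-empty set $\mathcal{R}$ of rules, denoted
$d(\mathcal{R})$, is the maximum of the depths of the left-hand sides of rules in
$\mathcal{R}$.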

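Constructor specifications. We assume from now on given a conditional
constrained rewrite system $\mathcal{R}$. The subset of $\mathcal{R}$ containing only
function symbols from $\mathcal{C}$ is denoted $\mathcal{R}_{\mathcal{C}}$, and
$\mathcal{R} \setminus \mathcal{R}_{\mathcal{C}}$ is denoted $\mathcal{R}_{\mathcal{D}}$.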

Inductive theorems. A clause $C$ is a deductive theorem of $\mathcal{R}$ (denoted
$\mathcal{R} \models C$) if it is valid in any model of $\mathcal{R}$. A clause $C$ is an
inductive theorem of $\mathcal{R}$ (denoted $\mathcal{R} \models_{ind} C$) iff for all
substitutions $\sigma$ grounding for $C$, $\mathcal{R} \models C\sigma$.

We shall need below to generalize the definition of inductive theorems to
constrained clauses as follows: a constrained clause $C \, \llbracket c \rrbracket$ is an
inductive theorem of $\mathcal{R}$ (denoted $\mathcal{R} \models_{ind} C \, \llbracket c \rrbracket$) if for all
substitutions $\sigma \in \mathit{sol}(c)$ grounding for $C$ we have
$\mathcal{R} \models C\sigma$.

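Completeness. A function symbol $f \in \mathcal{D}$ is sufficiently complete wrt
$\mathcal{R}$ iff for all $t_1, \ldots, t_n \in \mathcal{T}(\mathcal{C})$, there exists $t$ in
$\mathcal{T}(\mathcal{C})$ such that $f(t_1, \ldots, t_n) \to^{+}_{\mathcal{R}} t$. We say that the
system $\mathcal{R}$ is sufficiently complete iff every defined operator
$f \in \mathcal{D}$ is sufficiently complete wrt $\mathcal{R}$. Let $f \in \mathcal{D}$ be a
function symbol and let:

\[ \{ \Gamma_1 \Rightarrow f(t^1_1, \ldots, t^1_k) \to r_1 \, \llbracket c_1 \rrbracket, \; \ldots, \; \Gamma_n \Rightarrow f(t^n_1, \ldots, t^n_k) \to r_n \, \llbracket c_n \rrbracket \} \]

be a maximal subset of rules of $\mathcal{R}_{\mathcal{D}}$ whose left-hand sides are
identical up to variable renamings $\mu_1, \ldots, \mu_n$, i.e.
$f(t^1_1, \ldots, t^1_k)\mu_1 = f(t^2_1, \ldots, t^2_k)\mu_2 = \ldots = f(t^n_1, \ldots, t^n_k)\mu_n$.
We say that $f$ is strongly complete wrt $\mathcal{R}$ (see [2]) if $f$ is
sufficiently complete wrt $\mathcal{R}$ and
$\mathcal{R} \models_{ind} \Gamma_1\mu_1 \, \llbracket c_1\mu_1 \rrbracket \vee \ldots \vee \Gamma_n\mu_n \, \llbracket c_n\mu_n \rrbracket$
for every subset of $\mathcal{R}$ as above. The system $\mathcal{R}$ is said to be
strongly complete if every function symbol $f \in \mathcal{D}$ is strongly complete
wrt $\mathcal{R}$.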
3 Sorted Lists and Verification of Trace Properties 



In this section, we present some examples motivating the techniques
introduced in this paper. These examples illustrate the fact that our approach
supports constraints in the axioms (both for constructor and defined functions)
and the conjectures. Note that constrained rules are not supported by test set
induction procedures.

3.1 Constructor Specification, Normal Form Grammar 

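Consider a signature with the set of sorts
$\mathcal{S} = \{\mathsf{Bool}, \mathsf{Nat}, \mathsf{Set}\}$, and constructor symbols:

\[ \mathcal{C} = \{ \mathit{true}, \mathit{false} : \mathsf{Bool}, \;\; 0 : \mathsf{Nat}, \;\; s : \mathsf{Nat} \to \mathsf{Nat}, \;\; \emptyset : \mathsf{Set}, \;\; \mathit{ins} : \mathsf{Nat} \times \mathsf{Set} \to \mathsf{Set} \} \]

and a constructor rewrite system for ordered lists without duplication:

\[
\mathcal{R}_{\mathcal{C}} = \left\{
\begin{array}{l}
\mathit{ins}(x_1, \mathit{ins}(x_2, y)) \to \mathit{ins}(x_2, y) \; \llbracket x_1 \approx x_2 \rrbracket \\
\mathit{ins}(x_1, \mathit{ins}(x_2, y)) \to \mathit{ins}(x_2, \mathit{ins}(x_1, y)) \; \llbracket x_1 \succ x_2 \rrbracket
\end{array}
\right\}
\]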

Note the presence of constraints in these rewrite rules. The equality
constraint in the first rule permits the elimination of (successive)
redundancies in lists, and the ordering constraint in the second rule ensures
that the application of this rule will sort the lists. Note that the first
rule actually corresponds to the unconstrained rewrite rule
$\mathit{ins}(x, \mathit{ins}(x, y)) \to \mathit{ins}(x, y)$. As outlined in the introduction,
this rule cannot be handled by the procedures of [5, 6], because it is not
left-linear.
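The effect of these two constructor rules on ground lists can be observed with a small Python sketch. The encoding is an assumption made for illustration: naturals $s^n(0)$ are modelled by Python integers, the ordering constraint is interpreted by the usual order on integers, and lists are nested tuples. The function names are hypothetical.

```python
# Ground terms of sort Set: EMPTY or ("ins", n, rest), with n a Python int.
EMPTY = ("empty",)

def ins(n, rest):
    return ("ins", n, rest)

def step(t):
    """Apply one constructor rule somewhere in t, or return None if t is irreducible."""
    if t == EMPTY:
        return None
    _, x1, tail = t
    if tail != EMPTY:
        _, x2, y = tail
        if x1 == x2:                     # first rule, constraint x1 = x2
            return ins(x2, y)
        if x1 > x2:                      # second rule, constraint x1 > x2
            return ins(x2, ins(x1, y))
    reduced_tail = step(tail)            # otherwise try to rewrite below the root
    return None if reduced_tail is None else ins(x1, reduced_tail)

def normalize(t):
    """Rewrite t with the constructor rules until no rule applies."""
    while True:
        nxt = step(t)
        if nxt is None:
            return t
        t = nxt
```

Under these assumptions, `normalize(ins(2, ins(1, ins(2, ins(1, EMPTY)))))` yields the sorted, duplicate-free list `ins(1, ins(2, EMPTY))`.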

Constrained grammars are presented formally in Section 4. In this section, we
shall only give a taste of this formalism and of how it is used in the
automatic inductive proof of conjectures.

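The set of ground $\mathcal{R}_{\mathcal{C}}$-normal forms is described by the following set
of patterns:

\[
\begin{array}{rl}
\mathrm{NF}(\mathcal{R}_{\mathcal{C}}) = & \{x : \mathsf{Bool}\} \cup \{x : \mathsf{Nat}\} \cup \{\emptyset\} \cup \{\mathit{ins}(x, \emptyset) \mid x : \mathsf{Nat}\} \\
 & \cup \; \{\mathit{ins}(x_1, \mathit{ins}(x_2, y)) \mid x_1, x_2 : \mathsf{Nat},\ \mathit{ins}(x_2, y) \in \mathrm{NF}(\mathcal{R}_{\mathcal{C}}),\ x_1 \prec x_2\}
\end{array}
\]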

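We build a constrained grammar $\mathcal{G}_{\mathrm{NF}}(\mathcal{R}_{\mathcal{C}})$ which generates
$\mathrm{NF}(\mathcal{R}_{\mathcal{C}})$ by means of non-terminal replacement guided by some
production rules. The first four subsets of $\mathrm{NF}(\mathcal{R}_{\mathcal{C}})$ are generated
by a tree grammar from the four non-terminals
$\{\llcorner x^{\mathsf{Bool}} \lrcorner, \llcorner x^{\mathsf{Nat}} \lrcorner, \llcorner x^{\mathsf{Set}} \lrcorner, \llcorner \mathit{ins}(x, y) \lrcorner\}$
and using the production rules (the non-terminals are considered below modulo
variable renaming):

\[
\begin{array}{ll}
\llcorner x^{\mathsf{Bool}} \lrcorner := \mathit{true} & \llcorner x^{\mathsf{Bool}} \lrcorner := \mathit{false} \\
\llcorner x^{\mathsf{Nat}} \lrcorner := 0 & \llcorner x^{\mathsf{Nat}} \lrcorner := s(\llcorner x_2^{\mathsf{Nat}} \lrcorner) \\
\llcorner x^{\mathsf{Set}} \lrcorner := \emptyset & \llcorner \mathit{ins}(x, y) \lrcorner := \mathit{ins}(\llcorner x^{\mathsf{Nat}} \lrcorner, \llcorner x^{\mathsf{Set}} \lrcorner)
\end{array}
\]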

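For the last subset of $\mathrm{NF}(\mathcal{R}_{\mathcal{C}})$, we need to apply the negation of
the constraint $x_1 \approx x_2 \vee x_1 \succ x_2$ in the production rules of the
grammar. For this purpose, we add the production rule:

\[ \llcorner \mathit{ins}(x, y) \lrcorner := \mathit{ins}(\llcorner x^{\mathsf{Nat}} \lrcorner, \llcorner \mathit{ins}(x_2, y_2) \lrcorner) \; \llbracket x^{\mathsf{Nat}} \prec x_2 \rrbracket \]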

Note that the variables in the non-terminal $\llcorner \mathit{ins}(x_2, y_2) \lrcorner$ in
the right member of the above production rule have been renamed in order to be
distinguished from the variables in the non-terminal in the left member.
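As a sanity check of the two production rules for the non-terminal headed by ins, the following Python sketch enumerates the terms they generate within finite bounds and compares them with a brute-force computation of the normal forms. The bounds, the modelling of naturals by Python integers, and all function names are assumptions of this sketch; the formal construction is the one of Section 4.

```python
from itertools import product

EMPTY = ("empty",)

def gen_ins(max_nat, max_len):
    """Terms derived from the non-terminal for ins(x, y), with naturals in
    0..max_nat and at most max_len elements."""
    if max_len == 0:
        return []
    nats = range(max_nat + 1)
    out = [("ins", n, EMPTY) for n in nats]           # ins(Nat, Set) with Set := empty
    for tail in gen_ins(max_nat, max_len - 1):
        x2 = tail[1]
        # constrained production: the new head must be strictly smaller than x2
        out += [("ins", n, tail) for n in nats if n < x2]
    return out

def is_normal(t):
    """True iff no constructor rule applies anywhere in the ground list t."""
    while t != EMPTY and t[2] != EMPTY:
        if t[1] >= t[2][1]:                           # equal or greater: reducible
            return False
        t = t[2]
    return True

def all_lists(max_nat, max_len):
    """Every non-empty ground list within the same bounds."""
    out = []
    for k in range(1, max_len + 1):
        for xs in product(range(max_nat + 1), repeat=k):
            t = EMPTY
            for x in reversed(xs):
                t = ("ins", x, t)
            out.append(t)
    return out

generated = set(gen_ins(2, 3))
expected = {t for t in all_lists(2, 3) if is_normal(t)}
```

Within these bounds the two sets coincide, which is the exactness property that distinguishes the grammar from an over-approximating induction scheme.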
3.2 Defined Symbols and Conjectures 

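We complete the above signature with the set of defined function symbols:

\[ \mathcal{D} = \{ \mathit{sorted} : \mathsf{Set} \to \mathsf{Bool}, \;\; \in, \Subset \, : \mathsf{Nat} \times \mathsf{Set} \to \mathsf{Bool} \} \]

and the conditional constrained TRS $\mathcal{R}_{\mathcal{D}}$ containing the following rules:

\[
\begin{array}{lll}
\mathit{sorted}(\emptyset) & \to \mathit{true} & (s_0) \\
\mathit{sorted}(\mathit{ins}(x, \emptyset)) & \to \mathit{true} & (s_1) \\
\mathit{sorted}(\mathit{ins}(x_1, \mathit{ins}(x_2, y))) & \to \mathit{sorted}(\mathit{ins}(x_2, y)) \; \llbracket x_1 \prec x_2 \rrbracket & (s_2)
\end{array}
\]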

Note that there is no axiom for the case $\llbracket x_1 \succeq x_2 \rrbracket$. The defined
function $\mathit{sorted}$ is nevertheless sufficiently complete wrt $\mathcal{R}$. We
can show with an induction (on the size of the term) that every term $t$ of
the form $\mathit{sorted}(\mathit{ins}(t_1, \mathit{ins}(t_2, \ell)))$ can be reduced to a
constructor term. If $t_1 \prec t_2$, then $(s_2)$ applies and the term
obtained is smaller than $t$. If $t_1 \succeq t_2$, then $t$ is reducible by
$\mathcal{R}_{\mathcal{C}}$ into the smaller $\mathit{sorted}(\mathit{ins}(t_2, \ell))$ if
$t_1 \approx t_2$, or into $\mathit{sorted}(\mathit{ins}(t_2, \mathit{ins}(t_1, \ell)))$ if
$t_1 \succ t_2$, and this latter term is furthermore reduced by the rule
$(s_2)$ of $\mathcal{R}_{\mathcal{D}}$ into $\mathit{sorted}(\mathit{ins}(t_1, \ell))$.
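The case analysis of this argument can be replayed on ground lists with the following Python sketch, which interleaves the three rules for sorted with the two constructor rules applied under sorted. As before, modelling naturals by Python integers and the names `to_term` and `eval_sorted` are assumptions made for illustration.

```python
from itertools import product

EMPTY = ("empty",)

def to_term(xs):
    """Build ins(x1, ins(x2, ... empty)) from a Python sequence of naturals."""
    t = EMPTY
    for x in reversed(xs):
        t = ("ins", x, t)
    return t

def eval_sorted(t):
    """Reduce sorted(t) to a Boolean, following the case analysis of the text."""
    while True:
        if t == EMPTY:
            return True                          # (s0)
        _, x1, tail = t
        if tail == EMPTY:
            return True                          # (s1)
        _, x2, y = tail
        if x1 < x2:
            t = tail                             # (s2): continue with ins(x2, y)
        elif x1 == x2:
            t = ("ins", x2, y)                   # constructor rule for duplicates
        else:
            t = ("ins", x2, ("ins", x1, y))      # constructor rule for swapping

# sorted(t) reaches a constructor term for every ground list within the bounds
results = [eval_sorted(to_term(xs))
           for k in range(5)
           for xs in product(range(3), repeat=k)]
```

The loop terminates on every input because each equality step removes an element and each swap is immediately followed by an application of the third rule for sorted, which also removes one.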

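The rules $(m'_0)$ to $(m'_3)$ implement a membership test $\Subset$ restricted
to ordered lists. The function $\in$ specified below is another variant of a
membership test on lists.

\[
\begin{array}{lll}
x \in \emptyset & \to \mathit{false} & (m_0) \\
x_1 \in \mathit{ins}(x_2, y) & \to \mathit{true} \; \llbracket x_1 \approx x_2 \rrbracket & (m_1) \\
x_1 \in \mathit{ins}(x_2, y) & \to x_1 \in y \; \llbracket x_1 \not\approx x_2 \rrbracket & (m_2) \\[4pt]
x \Subset \emptyset & \to \mathit{false} & (m'_0) \\
x_1 \Subset \mathit{ins}(x_2, y_2) & \to \mathit{true} \; \llbracket x_1 \approx x_2 \rrbracket & (m'_1) \\
x_1 \Subset y_1 & \to \mathit{false} \; \llbracket x_1 \prec x_2,\ y_1 : \llcorner \mathit{ins}(x_2, y_2) \lrcorner \rrbracket & (m'_2) \\
x_1 \Subset \mathit{ins}(x_2, y_2) & \to x_1 \Subset y_2 \; \llbracket x_2 \prec x_1 \rrbracket & (m'_3)
\end{array}
\]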



Like $\mathit{sorted}$, the defined functions $\in$ and $\Subset$ are sufficiently complete wrt $\mathcal{R}$.

The above version of the rule $(m'_2)$ is the formal one (the version in the
introduction was given in a simplified notation). Note the presence of the
membership constraint $y_1 : \llcorner \mathit{ins}(x_2, y_2) \lrcorner$ in $(m'_2)$. It
refers to the above normal form grammar $\mathcal{G}_{\mathrm{NF}}(\mathcal{R}_{\mathcal{C}})$ and hence
restricts the variable $y_1$ to be a constructor term headed by $\mathit{ins}$
and in normal form.

One may wonder why we added this membership constraint and why a rule
$(m''_2)$ of the form
$x_1 \Subset \mathit{ins}(x_2, y_2) \to \mathit{false} \; \llbracket x_1 \prec x_2 \rrbracket$ would not be
satisfactory. The reason is that with the rule $(m''_2)$ instead of $(m'_2)$,
the specification is not consistent. Indeed, let us consider the ground term
$t = 0 \Subset \mathit{ins}(s(0), \mathit{ins}(0, \emptyset))$. Note that $t$ is not in normal
form. It can be rewritten on the one hand into
$0 \Subset \mathit{ins}(0, \mathit{ins}(s(0), \emptyset))$ by $\mathcal{R}_{\mathcal{C}}$, which is in turn
rewritten into $\mathit{true}$ using $(m'_1)$. On the other hand, $t$ can be
rewritten into $\mathit{false}$ by $(m''_2)$. This second rewriting is not
possible with $(m'_2)$, because of the membership constraint in this rule.
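The inconsistency described above can be reproduced with a short Python sketch that explores every rewrite sequence from a ground membership test on sorted lists and collects the Boolean results it can reach. The flag `guarded` switches between the rule carrying the normal-form membership constraint and its unconstrained variant. Naturals are modelled by Python integers, and the names `rc_successors` and `outcomes` are hypothetical.

```python
EMPTY = ("empty",)

def rc_successors(t):
    """All one-step reducts of the ground list t by the constructor rules."""
    if t == EMPTY:
        return []
    _, x1, tail = t
    out = [("ins", x1, r) for r in rc_successors(tail)]
    if tail != EMPTY:
        _, x2, y = tail
        if x1 == x2:
            out.append(("ins", x2, y))
        elif x1 > x2:
            out.append(("ins", x2, ("ins", x1, y)))
    return out

def outcomes(x, lst, guarded):
    """Set of Booleans reachable when testing whether x belongs to lst."""
    seen, todo, results = set(), [lst], set()
    while todo:
        l = todo.pop()
        if l in seen:
            continue
        seen.add(l)
        succ = rc_successors(l)
        todo.extend(succ)                 # rewriting the list argument first
        if l == EMPTY:
            results.add(False)            # empty list: false
            continue
        _, x2, y2 = l
        if x == x2:
            results.add(True)             # head equals x: true
        elif x < x2:
            # answer false; when guarded, only if l is in normal form
            if not guarded or not succ:
                results.add(False)
        else:
            todo.append(y2)               # head smaller than x: look in the tail
    return results
```

With `t = ("ins", 1, ("ins", 0, EMPTY))`, the call `outcomes(0, t, guarded=False)` returns both Booleans, whereas `outcomes(0, t, guarded=True)` returns only `True`.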
Another idea to overcome this problem would be to add a condition as in:

\[ \mathit{sorted}(y) = \mathit{true} \; \Rightarrow \; x_1 \Subset \mathit{ins}(x_2, y) \to \mathit{false} \; \llbracket x_1 \prec x_2 \rrbracket \qquad (m'''_2) \]

The specification with $(m'''_2)$ is inconsistent as well since the term $t$ is
rewritten by $\mathcal{R}_{\mathcal{C}}$ into $\mathit{sorted}(\mathit{ins}(0, \mathit{ins}(s(0), \emptyset)))$,
which is rewritten into $\mathit{true}$ by $\mathcal{R}_{\mathcal{D}}$. Therefore, the addition
of the membership constraint in rule $(m'_2)$ is necessary for the
specification of $\Subset$.

Let us consider the following two conjectures that we wish to prove by
induction:

\[
\begin{array}{ll}
\mathit{sorted}(y) = \mathit{true} & (1) \\
x \Subset y = x \in y & (2)
\end{array}
\]

3.3 Test Set Induction 

Roughly, the principle of a proof by test set induction [7, 2] is the one
presented in the introduction except that:

1. the induction scheme is a test set (a finite set of terms).

2. variables in the goals are instantiated by terms from the test set.

Moreover, the instantiation in 2 can be restricted to so-called induction
variables (see [2]), which are the variables occurring (in a term of a goal)
at a non-variable and non-root position of some left-hand sides of rules of
$\mathcal{R}_{\mathcal{D}}$.

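Let us try to prove (1) using the test set induction technique. A test set¹
for $\mathcal{R}$ (and sort $\mathsf{Set}$) has to contain:

\[ TS(\mathsf{Set}, \mathcal{R}) = \{ \emptyset, \; \mathit{ins}(x_1, \emptyset), \; \mathit{ins}(x_1, \mathit{ins}(x_2, y)) \} \]

We start by replacing $y$ in (1) by the terms from the test set
$TS(\mathsf{Set}, \mathcal{R})$, and obtain:

\[
\begin{array}{ll}
\mathit{sorted}(\emptyset) = \mathit{true} & (3) \\
\mathit{sorted}(\mathit{ins}(x_1, \emptyset)) = \mathit{true} & (4) \\
\mathit{sorted}(\mathit{ins}(x_1, \mathit{ins}(x_2, y))) = \mathit{true} & (5)
\end{array}
\]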

Subgoals (3) and (4) are simplified by $\mathcal{R}_{\mathcal{D}}$ (respectively with rules
$(s_0)$ and $(s_1)$) into $\mathit{true} = \mathit{true}$, which is a tautology.
Subgoal (5) cannot be simplified by $\mathcal{R}_{\mathcal{D}}$, because of the constraints
in rewrite rules. Subgoal (5) does not contain any induction variable, and
therefore, it cannot be further instantiated. So, the proof stops without a
conclusion. Hence, we fail to prove Conjecture (1) with the test set induction
technique.

Concerning Conjecture (2), the specification of the rules for $\Subset$ contains membership constraints. This kind of specification is not supported by the current test set induction procedures.
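The over-approximation behind this failure can be observed on a small finite model. The following is only an illustrative sketch, not part of the procedure: it assumes that naturals are modelled by Python integers, that a term ins(n1, ins(n2, ... )) is modelled by the tuple (n1, n2, ...), and that the normal forms are exactly the strictly increasing tuples. The function names are hypothetical.

```python
from itertools import product

def is_normal_form(elems):
    # Assumption of this sketch: ins(n1, ins(n2, ... emptyset)) is in normal
    # form exactly when n1 < n2 < ... (sorted, no duplicates).
    return all(a < b for a, b in zip(elems, elems[1:]))

def ground_instances(max_nat, max_tail):
    # Ground instances of the test set element ins(x1, ins(x2, y)):
    # x1 and x2 range over naturals below max_nat, and y over arbitrary
    # element tuples of length <= max_tail (the test set puts no constraint on y).
    for tail_len in range(max_tail + 1):
        yield from product(range(max_nat), repeat=2 + tail_len)

instances = list(ground_instances(max_nat=3, max_tail=1))
normal = [t for t in instances if is_normal_form(t)]
spurious = [t for t in instances if not is_normal_form(t)]
print(len(instances), len(normal), len(spurious))
```

In this bounded model most instances of the third test set element are not normal forms, which is the reason why constraints have to be attached when goals are instantiated.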

^ This test set is an over-approximating description of the set of constructor terms in normal form. For instance, the term $\mathit{ins}(s(0), \mathit{ins}(0, \emptyset))$ is an instance of the third element of the test set but it is not in normal form.

3.4 Constrained Grammars based Induction

As discussed above, we need to add appropriate constraints while instantiating 
the induction goals. This is precisely what constrained tree grammars do. 

Our procedure, presented in Section 5, roughly works as follows: given a conjecture $C$, we try to apply the production rules of the normal form grammar to $C$ (instead of instantiating by terms of a test set) as long as the depth of the clauses obtained is smaller than or equal to the maximal depth of a left-hand side of $\mathcal{R}_D$. All clauses obtained must be reducible by $\mathcal{R}$, by induction hypotheses, or by other conjectures not yet proved and smaller than $C$. If this succeeds, the clauses obtained after simplification are considered as new subgoals, and for their proof we can use $C$ as an induction hypothesis. Otherwise, the procedure fails and we have established a disproof under some assumptions on $\mathcal{R}$.

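In order to prove Conjecture (1), we constrain the variable $y$ of this clause to belong to one of the languages defined by the non-terminals (of a compatible sort) of the normal form grammar $\mathcal{G}_{NF}(\mathcal{R}_C)$. This is not restrictive since $\mathcal{R}_C$ is terminating and $\mathcal{R}$ is sufficiently complete.

$\mathit{sorted}(y) = \mathit{true}\ \llbracket y : \lfloor x^{\mathsf{Set}} \rfloor \rrbracket$    (1.a)
$\mathit{sorted}(y) = \mathit{true}\ \llbracket y : \lfloor \mathit{ins}(x_1, y_1) \rfloor \rrbracket$    (1.b)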

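Let us apply the above principle to the proof of Conjecture (1). The application of the production rules of the grammar to (1.a) and (1.b) returns:

$\mathit{sorted}(\emptyset) = \mathit{true}$    (3')
$\mathit{sorted}(\mathit{ins}(x_1, \emptyset)) = \mathit{true}\ \llbracket x_1 : \lfloor x^{\mathsf{Nat}} \rfloor \rrbracket$    (4')
$\mathit{sorted}(\mathit{ins}(x_1, \mathit{ins}(x_2, \emptyset))) = \mathit{true}\ \llbracket x_1, x_2 : \lfloor x^{\mathsf{Nat}} \rfloor,\ x_1 \prec x_2 \rrbracket$    (5')
$\mathit{sorted}(\mathit{ins}(x_1, \mathit{ins}(x_2, y_2))) = \mathit{true}\ \llbracket x_1, x_2, x_3 : \lfloor x^{\mathsf{Nat}} \rfloor,\ y_2 : \lfloor \mathit{ins}(x_3, y_3) \rfloor,\ x_1 \prec x_2,\ x_2 \prec x_3 \rrbracket$    (5'')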

For obtaining (4'), (5') and (5''), several steps of application of the production rules of the grammar are necessary. Subgoals (3') and (4') are simplified by $\mathcal{R}_D$ into a tautology, as in Section 3.3. Unlike in Section 3.3, Subgoal (5') can now be simplified using the rule $(s_2)$ of $\mathcal{R}_D$, because of its constraint $x_1 \prec x_2$. Moreover, Subgoal (5'') can be reduced by the rule $(s_2)$ into:

$\mathit{sorted}(\mathit{ins}(x_2, y_2)) = \mathit{true}\ \llbracket x_2, x_3 : \lfloor x^{\mathsf{Nat}} \rfloor,\ y_2 : \lfloor \mathit{ins}(x_3, y_3) \rfloor,\ x_2 \prec x_3 \rrbracket$

This latter subgoal can itself be simplified into $\mathit{true} = \mathit{true}$ by (1), used here as an induction hypothesis. This terminates the inductive proof of (1).

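For the proof of Conjecture (2), the situation is more complicated. The decoration of the variables of (2) with non-terminals of the grammar $\mathcal{G}_{NF}(\mathcal{R}_C)$ returns:

$x \Subset y = x \in y\ \llbracket x : \lfloor x^{\mathsf{Nat}} \rfloor,\ y : \lfloor x^{\mathsf{Set}} \rfloor \rrbracket$    (2.a)
$x \Subset y = x \in y\ \llbracket x : \lfloor x^{\mathsf{Nat}} \rfloor,\ y : \lfloor \mathit{ins}(x_1, y_1) \rfloor \rrbracket$    (2.b)

The application of the production rules of $\mathcal{G}_{NF}(\mathcal{R}_C)$ to these clauses gives:

$x \Subset \emptyset = x \in \emptyset$    (6)
$x \Subset \mathit{ins}(x_1, \emptyset) = x \in \mathit{ins}(x_1, \emptyset)\ \llbracket x, x_1 : \lfloor x^{\mathsf{Nat}} \rfloor \rrbracket$    (7)
$x \Subset \mathit{ins}(x_1, \mathit{ins}(x_2, \emptyset)) = x \in \mathit{ins}(x_1, \mathit{ins}(x_2, \emptyset))\ \llbracket x, x_1, x_2 : \lfloor x^{\mathsf{Nat}} \rfloor,\ x_1 \prec x_2 \rrbracket$    (8)
$x \Subset \mathit{ins}(x_1, \mathit{ins}(x_2, y_2)) = x \in \mathit{ins}(x_1, \mathit{ins}(x_2, y_2))\ \llbracket x, x_1, x_2, x_3 : \lfloor x^{\mathsf{Nat}} \rfloor,\ y_2 : \lfloor \mathit{ins}(x_3, y_3) \rfloor,\ x_1 \prec x_2,\ x_2 \prec x_3 \rrbracket$    (9)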

The clause (6) is reduced, using $(m'_0)$ and $(m_0)$, to the tautology $\mathit{false} = \mathit{false}$.

In order to simplify (7), we restrict to the cases corresponding to the constraints of the rules $(m'_1)$, $(m'_2)$ and $(m'_3)$. This technique, called Rewrite Splitting, is defined formally in Section 5. We obtain respectively:

$\mathit{true} = x \in \mathit{ins}(x_1, \emptyset)\ \llbracket x_1 : \lfloor x^{\mathsf{Nat}} \rfloor,\ x \approx x_1 \rrbracket$    (7.1)
$\mathit{false} = x \in \mathit{ins}(x_1, \emptyset)\ \llbracket x_1 : \lfloor x^{\mathsf{Nat}} \rfloor,\ \mathit{ins}(x_1, \emptyset) : \lfloor \mathit{ins}(x_2, y_2) \rfloor,\ x \prec x_2 \rrbracket$    (7.2)
$x \Subset \emptyset = x \in \mathit{ins}(x_1, \emptyset)\ \llbracket x_1 : \lfloor x^{\mathsf{Nat}} \rfloor,\ x_1 \prec x \rrbracket$    (7.3)

Note that the constraint in (7.2) implies that $x_1 = x_2$. All these subgoals are reduced into tautologies $\mathit{true} = \mathit{true}$ or $\mathit{false} = \mathit{false}$ using respectively the following rules of $\mathcal{R}_D$:

- $(m_1)$ for (7.1),

- $(m_2)$ and $(m_0)$ for (7.2) (with $x_1 = x_2$),

- $(m'_0)$ for the left member of (7.3), and $(m_2)$ then $(m_0)$ for its right member.

The subgoal (8) is also treated by Rewrite Splitting with the rules $(m'_1)$, $(m'_2)$, $(m'_3)$ of $\mathcal{R}_D$, similarly as above.

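Let us now finish the proof of Conjecture (2), with the subgoal (9). By Rewrite Splitting with the rules $(m'_1)$, $(m'_2)$, $(m'_3)$, we obtain:

$\mathit{true} = x \in \mathit{ins}(x_1, \mathit{ins}(x_2, y_2))\ \llbracket x, x_1, x_2, x_3 : \lfloor x^{\mathsf{Nat}} \rfloor,\ y_2 : \lfloor \mathit{ins}(x_3, y_3) \rfloor,\ x_1 \prec x_2,\ x_2 \prec x_3,\ x \approx x_1 \rrbracket$    (9.1)

$\mathit{false} = x \in \mathit{ins}(x_1, \mathit{ins}(x_2, y_2))\ \llbracket x, x_1, x_2, x_3, x_4 : \lfloor x^{\mathsf{Nat}} \rfloor,\ y_2 : \lfloor \mathit{ins}(x_3, y_3) \rfloor,\ x_1 \prec x_2,\ x_2 \prec x_3,\ \mathit{ins}(x_1, \mathit{ins}(x_2, y_2)) : \lfloor \mathit{ins}(x_4, y_4) \rfloor,\ x \prec x_4 \rrbracket$    (9.2)

$x \Subset \mathit{ins}(x_2, y_2) = x \in \mathit{ins}(x_1, \mathit{ins}(x_2, y_2))\ \llbracket x, x_1, x_2, x_3 : \lfloor x^{\mathsf{Nat}} \rfloor,\ y_2 : \lfloor \mathit{ins}(x_3, y_3) \rfloor,\ x_1 \prec x_2,\ x_2 \prec x_3,\ x_1 \prec x \rrbracket$    (9.3)

The subgoal (9.1) is simplified by $(m_1)$ into the tautology $\mathit{true} = \mathit{true}$. The subgoal (9.3) is simplified by $(m_2)$ into:

$x \Subset \mathit{ins}(x_2, y_2) = x \in \mathit{ins}(x_2, y_2)\ \llbracket x, x_2, x_3 : \lfloor x^{\mathsf{Nat}} \rfloor,\ y_2 : \lfloor \mathit{ins}(x_3, y_3) \rfloor,\ x_2 \prec x_3 \rrbracket$    (10)

At this point, we are allowed to use the goal (2) as an induction hypothesis, since we have performed a reduction step on the subgoals. A simplification of (10) using (2) gives the tautology:

$x \Subset \mathit{ins}(x_2, y_2) = x \Subset \mathit{ins}(x_2, y_2)\ \llbracket x, x_2, x_3 : \lfloor x^{\mathsf{Nat}} \rfloor,\ y_2 : \lfloor \mathit{ins}(x_3, y_3) \rfloor,\ x_2 \prec x_3 \rrbracket$

For the subgoal (9.2), note that in the constraints, $x \prec x_4$ implies $x \prec x_1$. Hence (9.2) can be simplified by $(m_2)$ into: $\mathit{false} = x \in \mathit{ins}(x_2, y_2)\ \llbracket \ldots \rrbracket$. A simplification of the above subgoal using (2) (as an induction hypothesis) gives: $\mathit{false} = x \Subset \mathit{ins}(x_2, y_2)\ \llbracket \ldots \rrbracket$. The above subgoal has the same constraints as (9.2), and it can be observed that these constraints imply $x \prec x_2$. Therefore, we can simplify this subgoal using $(m'_2)$ into the tautology $\mathit{false} = \mathit{false}$.

In conclusion, Conjecture (2) can be proved with our approach based on constrained grammars without the addition of any lemmas.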

3.5 Proof with ACL2

A proof of Conjecture (2) was done by Jared Davis^ with the ACL2 theorem prover, using his library osets for finite set theory [15]. In this library, sets are implemented as fully ordered lists (wrt an ordering <<). The definition in osets of a function insert a X, for the insertion of an element a into a list X, is the same as the above axioms of $\mathcal{R}_C$:

(defun insert (a X)
  (declare (xargs :guard (setp X)))
  (cond ((empty X) (list a))
        ((equal (head X) a) X)
        ((<< a (head X)) (cons a X))
        (t (cons (head X) (insert a (tail X))))))

It refers to the functions head and tail, which return respectively the first (smallest) element of a list (the LISP car) and the rest of a list (the LISP cdr). The guard (setp X) ensures that X is a fully ordered list without duplication.

The library osets contains a definition of membership similar to the axioms $(m_0)$-$(m_2)$ of $\mathcal{R}_D$ for the definition of $\in$:

(defun in (a X)
  (declare (xargs :guard (setp X)))
  (and (not (empty X))
       (or (equal a (head X))
           (in a (tail X)))))

Next, our defined function $\Subset$ becomes the following inb:

(defun inb (a X)
  (declare (xargs :guard (setp X)))
  (and (not (empty X))
       (not (and (setp X) (<< a (head X))))
       (or (equal a (head X))
           (inb a (tail X)))))

^ Jared Davis, personal communication.

The conjecture (2) becomes: 

(defthm in-is-inb 
(equal (in a X) 

(inb a X))) 
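Before attempting a proof, a statement of this kind can be sanity-checked by exhaustive enumeration on small instances. The sketch below is a simplified Python model, not the osets library: sets are modelled as strictly increasing tuples of integers, `<<` is modelled by `<` on integers, and the names `member`, `member_b` and `member_unguarded` are hypothetical counterparts of `in`, `inb` and of an early-exit variant without the `setp` test.

```python
from itertools import combinations

def setp(xs):
    # Model of the guard: a strictly increasing tuple.
    return all(a < b for a, b in zip(xs, xs[1:]))

def member(a, xs):
    # Counterpart of `in`: plain recursive membership.
    return bool(xs) and (a == xs[0] or member(a, xs[1:]))

def member_b(a, xs):
    # Counterpart of `inb`: stops early when xs is an ordered set and a is below its head.
    if not xs:
        return False
    if setp(xs) and a < xs[0]:
        return False
    return a == xs[0] or member_b(a, xs[1:])

def member_unguarded(a, xs):
    # Early exit WITHOUT checking that xs is an ordered set.
    if not xs:
        return False
    if a < xs[0]:
        return False
    return a == xs[0] or member_unguarded(a, xs[1:])

def find_disagreement(max_nat):
    # Compare member and member_b on every ordered set over range(max_nat).
    universe = range(max_nat)
    for size in range(max_nat + 1):
        for xs in combinations(universe, size):  # tuples come out strictly increasing
            for a in universe:
                if member(a, xs) != member_b(a, xs):
                    return (a, xs)
    return None

print(find_disagreement(5))
print(member(0, (1, 0)), member_unguarded(0, (1, 0)))
```

On the unordered tuple (1, 0) the unguarded variant disagrees with plain membership, which mirrors the role played by the membership constraint in the rewrite rules. Such a bounded check is of course no substitute for the inductive proof.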

Using the osets library, the system proved everything except the following 
subgoal: 

(IMPLIES (AND (NOT (EMPTY X))
              (SETP X)
              (<< A (HEAD X)))
         (EQUAL (IN A X) (INB A X))).

The following lemma makes it possible to finish the proof:

(defthm head-minimal
  (implies (<< a (head X))
           (not (in a X)))
  :hints (("Goal"
           :in-theory (enable primitive-order-theory))))

The lemma head-minimal was not available to users of the library osets. It 
will be incorporated (together with the technical lemma for its proof) in the 
appropriate file of the osets library. 

(local (defthm lemma
  (implies (and (not (empty X))
                (not (equal a (head X)))
                (not (<< a (head (tail X))))
                (<< a (head X)))
           (not (in a X)))
  :hints (("Goal"
           :in-theory (enable primitive-order-theory)
           :cases ((empty (tail X)))))))

Note that this proof uses several theorems and hints included in the osets library. Without this library, the ACL2 theorem prover would need the addition of several key lemmas and hints. To find them, the user would need both experience and a good understanding of the problem and of how to solve it.

3.6 Assisted Proof with SPIKE 

Conjecture (2) was proved with the latest version of SPIKE by Sorin Stratulat^.

^ Sorin Stratulat, personal communication.

Since SPIKE does not support constrained axioms, constraints are expressed as conditions. The specification of sorted becomes:

sorted(Nil) = true;
sorted(ins(x, Nil)) = true;
x1 <= x2 = true => sorted(ins(x1, ins(x2, y))) = sorted(ins(x2, y));
x1 <= x2 = false => sorted(ins(x1, ins(x2, y))) = false;

The axioms for $\in$ and $\Subset$ are respectively:

in(x1, Nil) = false;
x1 = x2 => in(x1, ins(x2, y)) = true;
x1 <> x2 => in(x1, ins(x2, y)) = in(x1, y);

and

in'(x1, Nil) = false;
x1 = x2 => in'(x1, ins(x2, y)) = true;
x2 < x1 = true => in'(x1, ins(x2, y)) = in'(x1, y);
x1 < x2 = true, osetp(ins(x2, y)) = true => in'(x1, ins(x2, y)) = false;
x1 < x2 = true, osetp(ins(x2, y)) = false => in'(x1, ins(x2, y)) = in'(x1, y);

The unary predicate osetp characterizes ordered lists. It is defined by the following axioms.

osetp(Nil) = true;
osetp(ins(x, Nil)) = true;
osetp(ins(x, ins(y, z))) = and(x < y, osetp(ins(y, z)));

With this predicate, the conjecture is expressed as follows.

osetp(y) = true => in(x, y) = in'(x, y);

A particular user-specified strategy and the following additional lemmas were necessary for the termination of the proof with SPIKE. The first three lemmas are natural, the last one is less intuitive.

osetp(y) = true => sorted(y) = true;
osetp(ins(u1, u2)) = true => osetp(u2) = true;
u1 < u2 = true, u2 < u3 = true => u1 < u3 = true;
osetp(ins(u4, u5)) = true, u2 < u4 = true => in(u2, u5) = false;
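The conditional axioms and lemmas above can be cross-checked on a bounded model. The sketch below is an assumption-laden Python transcription, not SPIKE input: lists built with ins are modelled as tuples of integers, the function names (`sorted_`, `osetp`, `in_`, `in_prime`, `failed_checks`) are hypothetical, and the third lemma (transitivity of <) holds trivially for integers and is therefore not tested.

```python
from itertools import product

def sorted_(xs):
    # sorted(Nil) = sorted(ins(x, Nil)) = true; otherwise compare the two first elements.
    if len(xs) <= 1:
        return True
    return xs[0] <= xs[1] and sorted_(xs[1:])

def osetp(xs):
    # Ordered lists: strictly increasing.
    if len(xs) <= 1:
        return True
    return xs[0] < xs[1] and osetp(xs[1:])

def in_(a, xs):
    if not xs:
        return False
    return True if a == xs[0] else in_(a, xs[1:])

def in_prime(a, xs):
    if not xs:
        return False
    head, tail = xs[0], xs[1:]
    if a == head:
        return True
    if head < a:
        return in_prime(a, tail)
    # Remaining case: a < head.
    return False if osetp(xs) else in_prime(a, tail)

def failed_checks(max_nat=3, max_len=3):
    # Returns the names of the statements refuted on the bounded model.
    failures = set()
    nats = range(max_nat)
    lists = [xs for n in range(max_len + 1) for xs in product(nats, repeat=n)]
    for y in lists:
        if osetp(y) and not sorted_(y):
            failures.add("lemma 1")
        if y and osetp(y) and not osetp(y[1:]):
            failures.add("lemma 2")
        for x in nats:
            if osetp(y) and in_(x, y) != in_prime(x, y):
                failures.add("conjecture")
            if y and osetp(y) and x < y[0] and in_(x, y[1:]):
                failures.add("lemma 4")
    return failures

print(failed_checks())
```

An empty result only says that no counterexample exists among lists of length at most 3 over {0, 1, 2}; it gives no guarantee beyond that bound.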

3.7 Verification of Trace Properties 

We have seen in the previous sections how membership constraints can be used in the axioms of $\mathcal{R}$ for the specification of operations on complex data structures, and how our method can handle them. Our procedure can also handle membership constraints in the conjecture. This feature can be used, for instance, in order to restrict some terms to a particular pattern. It is very useful in the context of the verification of infinite systems, in order to express that a trace of events belongs to a (regular) set of bad traces.

In [3] we follow this approach for the verification of security properties of cryptographic protocols, using an adaptation of the procedure of this paper in order to deal with specifications which are not necessarily confluent and sufficiently complete. In this section we will not describe the specification of [3] in full detail, but we shall roughly describe the main lines of the approach. Consider the following conjecture:

$\mathit{trace}(y) \neq \mathit{true}\ \llbracket y : \lfloor x^{\mathsf{Trace}} \rfloor,\ y : \lfloor x^{\mathsf{Bad}} \rfloor \rrbracket$    (11)

Here, the membership constraint $y : \lfloor x^{\mathsf{Trace}} \rfloor$ restricts $y$ to be generated by the non-terminal $\lfloor x^{\mathsf{Trace}} \rfloor$ of the normal form constrained tree grammar. It means that $y$ is a constructor term in normal form (as in the above example of sorted lists) representing a list of events of a system. The second membership constraint $y : \lfloor x^{\mathsf{Bad}} \rfloor$ further restricts $y$ to belong to a regular tree language representing faulty traces (traces which lead to a state of the system corresponding to a failure, an attack for instance). Finally, the clause $\mathit{trace}(y) \neq \mathit{true}$ expresses that $y$ is not a trace of the system. Hence the above conjecture (11) means that no bad trace is reachable.

The defined function trace can be specified using constrained conditional rewrite rules. For instance, in [3], we follow the approach of Paulson [25] for the inductive specification of the message exchanges of the protocol, and of the actions of the insecure communication environment. Note also that we extend this model with equations specifying the cryptographic operations, like the following non-left-linear equation for the decryption operator dec in a symmetric cryptosystem: $\mathit{dec}(\mathit{enc}(x, y), y) = x$. These axioms, sometimes referred to as explicit destructor equations, permit a strict extension of the verification model (they allow strictly more attacks on protocols), and they are specified as constructor equations of $\mathcal{R}_C$ in our model.
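The effect of non-left-linearity can be made concrete with a tiny term rewriter. The following is a minimal sketch under our own assumptions, not the specification of [3]: terms are nested Python tuples such as ("dec", t1, t2) and ("enc", t1, t2), atoms are strings, the equation is oriented from left to right, and the function names are hypothetical.

```python
def rewrite_dec(term):
    # Apply dec(enc(x, y), y) -> x at the root, if the rule matches.
    if isinstance(term, tuple) and len(term) == 3 and term[0] == "dec":
        _, cipher, key = term
        if isinstance(cipher, tuple) and len(cipher) == 3 and cipher[0] == "enc":
            _, plain, enc_key = cipher
            # Non-left-linearity: both occurrences of y must be the same term.
            if enc_key == key:
                return plain
    return term

def normalize(term):
    # Innermost strategy: normalize the arguments first, then try the root.
    if isinstance(term, tuple):
        term = (term[0],) + tuple(normalize(t) for t in term[1:])
    return rewrite_dec(term)

print(normalize(("dec", ("enc", "m", "k"), "k")))
print(normalize(("dec", ("enc", "m", "k1"), "k2")))
```

Decryption with the wrong key leaves the term unchanged, because the syntactic equality test between the two keys fails.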

4 Constrained Tree Grammars 

Constrained tree grammars have been introduced in [9], in the context of automated induction. The idea of using such a formalism for inductive theorem proving is also found in e.g. [5, 12], because it is known that they can generate the languages of normal forms for arbitrary term rewriting systems.

In this paper, we push the idea one step further with a full integration of tree grammars with constraints in our induction procedure. Indeed, constrained tree grammars are used here:

i. as an induction scheme (instead of test sets), for triggering induction steps by instantiation of subgoals using production rules,

ii. as a decision procedure for checking deletion criteria, including tests like ground irreducibility or validity in restricted cases, as long as emptiness is decidable,

iii. for the definition and treatment of constraints of membership in fixed tree languages, in particular languages of normal forms.

We present in this section the definitions and results suited to our purpose. 

Definition 1. A constrained grammar $\mathcal{G} = (Q, \Delta)$ is given by: 1. a finite set $Q$ of non-terminals of the form $\lfloor u \rfloor$, where $u$ is a linear term of $T(\mathcal{F}, \mathcal{X})$, 2. a finite set $\Delta$ of production rules of the form $\lfloor v \rfloor := f(\lfloor u_1 \rfloor, \ldots, \lfloor u_n \rfloor)\ \llbracket c \rrbracket$ where $f \in \mathcal{F}$, $\lfloor v \rfloor, \lfloor u_1 \rfloor, \ldots, \lfloor u_n \rfloor \in Q$ (modulo variable renaming) and $c$ is a constraint.

The non-terminals are always considered modulo variable renaming. In particular, we assume wlog (for technical convenience) that the above term $f(u_1, \ldots, u_n)$ is linear and that $var(v) \cap var(f(u_1, \ldots, u_n)) = \emptyset$.

4.1 Languages of Terms 

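We associate to a given constrained grammar $\mathcal{G} = (Q, \Delta)$ a finite set of new unary predicates of constraint of the form $. : \lfloor u \rfloor$, where $\lfloor u \rfloor \in Q$ (modulo variable renaming). Constraints of the form $t : \lfloor u \rfloor$ are called membership constraints and their interpretation is given below. The production relation between constrained terms $\vdash^y_{\mathcal{G}}$ is defined by:

$t[y]\ \llbracket y : \lfloor v \rfloor \wedge d \rrbracket\ \vdash^y_{\mathcal{G}}\ t[f(y_1, \ldots, y_n)]\ \llbracket y_1 : \lfloor u_1 \rfloor \wedge \ldots \wedge y_n : \lfloor u_n \rfloor \wedge c \wedge d\tau \rrbracket$

if there exists $\lfloor v \rfloor := f(\lfloor u_1 \rfloor, \ldots, \lfloor u_n \rfloor)\ \llbracket c \rrbracket \in \Delta$ such that $f(u_1, \ldots, u_n) = v\tau$, and $y_1, \ldots, y_n$ are fresh variables. The variable $y$, constrained to be in the language defined by the non-terminal $\lfloor v \rfloor$, is replaced by $f(y_1, \ldots, y_n)$ where the variables $y_1, \ldots, y_n$ are constrained to the respective languages of the non-terminals $\lfloor u_1 \rfloor, \ldots, \lfloor u_n \rfloor$. The union of the relations $\vdash^y_{\mathcal{G}}$ for all $y$ is denoted $\vdash_{\mathcal{G}}$, and the reflexive transitive and transitive closures of the relation $\vdash_{\mathcal{G}}$ are respectively denoted by $\vdash^*_{\mathcal{G}}$ and $\vdash^+_{\mathcal{G}}$ ($\mathcal{G}$ may be omitted).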

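Definition 2. The language $L(\mathcal{G}, \lfloor u \rfloor)$ is the set of ground terms $t$ generated by a constrained grammar $\mathcal{G}$ from a non-terminal $\lfloor u \rfloor$, i.e. such that $y\ \llbracket y : \lfloor u \rfloor \rrbracket \vdash^* t\ \llbracket c \rrbracket$ where $c$ is satisfiable.

Given $Q' \subseteq Q$, we write $L(\mathcal{G}, Q') = \bigcup_{\lfloor u \rfloor \in Q'} L(\mathcal{G}, \lfloor u \rfloor)$ and $L(\mathcal{G}) = L(\mathcal{G}, Q)$. Given a constrained grammar $\mathcal{G} = (Q, \Delta)$, we can now define $sol(t : \lfloor u \rfloor)$, where $\lfloor u \rfloor \in Q$, as $\{\sigma \mid t\sigma \in L(\mathcal{G}, \lfloor u \rfloor)\}$.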

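Example 1. With the normal form grammar of Section 3.4, denoted $\mathcal{G}$ in this example, we have: $L(\mathcal{G}, \lfloor x^{\mathsf{Bool}} \rfloor) = \{\mathit{true}, \mathit{false}\}$, $L(\mathcal{G}, \lfloor x^{\mathsf{Nat}} \rfloor) = \{0, s^n(0) \mid n > 0\}$, $L(\mathcal{G}, \lfloor x^{\mathsf{Set}} \rfloor) = \{\emptyset\}$, $L(\mathcal{G}, \lfloor \mathit{ins}(x_1, x_2) \rfloor) = \{\mathit{ins}(s^{n_1}(0), \mathit{ins}(\ldots, \mathit{ins}(s^{n_k}(0), \emptyset))) \mid k \geq 1,\ n_1 < \ldots < n_k\}$.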
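The way an ordering constraint shapes the generated language can be illustrated by a bounded enumeration. This is only a sketch under assumptions of ours: the production rules used (one producing ins over a natural and the empty set, one producing ins over a natural and a non-empty set under the constraint that the new head is smaller than the head of the tail) are our reading of the grammar of the running example, $s^n(0)$ is modelled by the integer n, and a set term by the tuple of its elements.

```python
def nat_language(bound):
    # Terms generated from the Nat non-terminal, n standing for s^n(0).
    return list(range(bound))

def ins_language(bound, max_len):
    # Terms generated from the ins non-terminal, as tuples (n1, ..., nk), 1 <= k <= max_len.
    # Assumed production 1: ins(Nat, Set), with Set producing only the empty set.
    level = [(n,) for n in nat_language(bound)]
    result = list(level)
    for _ in range(max_len - 1):
        # Assumed production 2: ins(Nat, ins(x2, y2)) under the constraint x1 < x2,
        # relating the new head to the head of an already generated tail.
        level = [(n,) + tail
                 for n in nat_language(bound)
                 for tail in level
                 if n < tail[0]]
        result.extend(level)
    return result

print(ins_language(bound=3, max_len=3))
```

Only the local constraint between two neighbouring elements is checked at each production step, yet every generated tuple is strictly increasing, as in the description of the language of the ins non-terminal in Example 1.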

Note that every regular tree language $L$ can be generated by a constrained tree grammar following Definitions 1 and 2, with production rules of the form: $\lfloor x^{S} \rfloor := f(\lfloor x^{S_1} \rfloor, \ldots, \lfloor x^{S_n} \rfloor)$ where $S_1, \ldots, S_n, S$ are new sorts representing the non-terminals of a regular tree grammar generating $L$.

The intersection between the language generated by a constrained tree grammar (in some non-terminal) and a regular tree language is generated by a constrained tree grammar. The constrained grammar for the intersection is built with a product construction.
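A product construction can be sketched as follows for the simpler case of two regular tree grammars without constraints; in the constrained case one would expect the constraint of the first rule to be carried over to the product rule, but this is not modelled here. The encoding is an assumption of this sketch: a grammar is a dictionary mapping each non-terminal to a list of pairs (symbol, tuple of child non-terminals), and a ground term is a tuple (symbol, child1, ..., childn).

```python
def product_grammar(g1, g2):
    # Non-terminals of the product are pairs (a, b); a rule is kept when both
    # components have a rule with the same symbol and arity.
    prod = {}
    for a, rules_a in g1.items():
        for b, rules_b in g2.items():
            prod[(a, b)] = [
                (f, tuple(zip(kids_a, kids_b)))
                for f, kids_a in rules_a
                for g, kids_b in rules_b
                if f == g and len(kids_a) == len(kids_b)
            ]
    return prod

def generates(grammar, nt, term):
    # True if the ground term is generated from the non-terminal nt.
    symbol, children = term[0], term[1:]
    return any(
        f == symbol and len(kids) == len(children)
        and all(generates(grammar, k, c) for k, c in zip(kids, children))
        for f, kids in grammar[nt]
    )

# Hypothetical example grammars: lists of even naturals, and lists of even length.
even_elems = {
    "E": [("0", ()), ("s", ("O",))],
    "O": [("s", ("E",))],
    "L": [("nil", ()), ("cons", ("E", "L"))],
}
even_length = {
    "N": [("0", ()), ("s", ("N",))],
    "Even": [("nil", ()), ("cons", ("N", "Odd"))],
    "Odd": [("cons", ("N", "Even"))],
}
both = product_grammar(even_elems, even_length)

zero, nil = ("0",), ("nil",)
two = ("s", ("s", zero))
t = ("cons", zero, ("cons", two, nil))
print(generates(both, ("L", "Even"), t))
```

A term is generated from the pair (L, Even) exactly when it is generated from L in the first grammar and from Even in the second, which is the property the product construction relies on.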

4.2 Languages of Normal Forms 

The constrained grammar $\mathcal{G}_{NF}(\mathcal{R}_C) = (Q_{NF}(\mathcal{R}_C), \Delta_{NF}(\mathcal{R}_C))$ defined in Figure 1 generates the language of ground $\mathcal{R}_C$-normal forms. Its construction is a generalization of the one of [11]. Intuitively, it corresponds to the complementation and completion of a grammar for $\mathcal{R}_C$-reducible terms (such a grammar mainly does pattern matching of left members of rewrite rules), where every subset of states (for the complementation) is represented by the most general common instance of its elements (if they are unifiable). For the purpose of the construction of $\mathcal{G}_{NF}(\mathcal{R}_C)$, a new sort Red is added to $\mathcal{S}$ (the sort of reducible terms), and hence also a new variable $x^{\mathsf{Red}}$. An example of a constrained grammar for $\mathcal{R}_C$-normal



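$L(\mathcal{R}_C) = \{ u \mid u$ is a strict subterm of $l$ for some $l \to r\ \llbracket c \rrbracket \in \mathcal{R}_C$, or $u$ is a subterm of $l$ if $c$ is empty $\}$

$Q_{NF}(\mathcal{R}_C) = \{ \lfloor x^{\mathsf{Red}} \rfloor \} \cup \{ \lfloor mgi(t_1, \ldots, t_n) \rfloor \mid \{t_1, \ldots, t_n\}$ is a subset of $L(\mathcal{R}_C)$ s.t. $t_1, \ldots, t_n$ are unifiable $\}$

$\Delta_{NF}(\mathcal{R}_C)$ contains:

every $\lfloor x^{\mathsf{Red}} \rfloor := f(\lfloor u_1 \rfloor, \ldots, \lfloor u_n \rfloor)$ such that at least one of the $u_i$ is $x^{\mathsf{Red}}$,
every $\lfloor x^{\mathsf{Red}} \rfloor := f(\lfloor u_1 \rfloor, \ldots, \lfloor u_n \rfloor)\ \llbracket c \rrbracket$ and every $\lfloor t \rfloor := f(\lfloor u_1 \rfloor, \ldots, \lfloor u_n \rfloor)\ \llbracket \neg c \rrbracket$
such that $f \in \mathcal{F}$ with profile $S_1, \ldots, S_n \to S$
and $\lfloor u_1 \rfloor, \ldots, \lfloor u_n \rfloor \in Q_{NF}(\mathcal{R}_C)$, $u_1, \ldots, u_n$ have respective sorts $S_1, \ldots, S_n$,
$t = mgi\{u \mid \lfloor u \rfloor \in Q_{NF}(\mathcal{R}_C)$ and $u$ matches $f(u_1, \ldots, u_n)\}$,
$c = \bigvee_{l \to r \llbracket e \rrbracket \in \mathcal{R}_C,\ f(u_1, \ldots, u_n) = l\theta} e\theta$

Figure 1: Constrained grammar $\mathcal{G}_{NF}(\mathcal{R}_C)$ for $\mathcal{R}_C$-normal forms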



forms constructed this way was given in Section 3.1. 

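Lemma 1. For every term $t \in T(\mathcal{C})$, $t \in L(\mathcal{G}_{NF}(\mathcal{R}_C), \lfloor u \rfloor)$ for some $\lfloor u \rfloor \in Q_{NF}(\mathcal{R}_C) \setminus \{\lfloor x^{\mathsf{Red}} \rfloor\}$ iff $t$ is an $\mathcal{R}_C$-normal form.

Proof. We shall use the following Fact, which can be proved by a straightforward induction on the length of the derivation $y\ \llbracket y : \lfloor u \rfloor \rrbracket \vdash^* t\ \llbracket c \rrbracket$.

Fact 1 For each $\lfloor u \rfloor \in Q_{NF}(\mathcal{R}_C) \setminus \{\lfloor x^{\mathsf{Red}} \rfloor\}$, and each $t \in L(\mathcal{G}_{NF}(\mathcal{R}_C), \lfloor u \rfloor)$, $t$ is an instance of $u$ and $u = mgi\{w \mid \lfloor w \rfloor \in Q_{NF}(\mathcal{R}_C) \setminus \{\lfloor x^{\mathsf{Red}} \rfloor\}$ and $t$ is an instance of $w\}$.

Let us now show the 'only if' direction by induction on the length of the derivation $y\ \llbracket y : \lfloor u \rfloor \rrbracket \vdash^* t\ \llbracket c' \rrbracket$ (where $c'$ is satisfiable).

If the length is 1, then $t$ is a nullary symbol of $\mathcal{C}$, and by construction $t$ is $\mathcal{R}_C$-irreducible.

If $y\ \llbracket y : \lfloor u \rfloor \rrbracket \vdash f(y_1, \ldots, y_n)\ \llbracket y_1 : \lfloor u_1\theta \rfloor \wedge \ldots \wedge y_n : \lfloor u_n\theta \rfloor \wedge c\theta \rrbracket \vdash^* t\ \llbracket c' \rrbracket = f(t_1, \ldots, t_n)\ \llbracket c' \rrbracket$ for some production rule $\lfloor u \rfloor := f(\lfloor u_1 \rfloor, \ldots, \lfloor u_n \rfloor)\ \llbracket c \rrbracket \in \Delta_{NF}(\mathcal{R}_C)$ ($\theta$ is a variable renaming by fresh variables), then for every $i \in [1..n]$, $t_i \in L(\mathcal{G}_{NF}(\mathcal{R}_C), \lfloor u_i \rfloor)$, and $\lfloor u_i \rfloor \neq \lfloor x^{\mathsf{Red}} \rfloor$ (otherwise we would have $\lfloor u \rfloor = \lfloor x^{\mathsf{Red}} \rfloor$). Hence, by induction hypothesis, every $t_i$ is an $\mathcal{R}_C$-normal form. Assume that $t$ is $\mathcal{R}_C$-reducible (it must then be reducible at root position), and let $l \to r\ \llbracket d \rrbracket \in \mathcal{R}_C$ be such that $t = l\tau$, $\tau \in sol(d)$ and $l$ is maximum wrt subsumption among the rules of $\mathcal{R}_C$ satisfying these conditions. By construction, $u = l$ and $c = \neg d\sigma \wedge c'$. It follows from the satisfiability of $c'$ that $\tau \in sol(c)$ (the variables of $c$ are instantiated by ground terms in the above grammar derivation). This is in contradiction with $c = \neg d\sigma \wedge c'$ and $\tau \in sol(d)$.

We show now the 'if' direction by induction on $t$.

If $t$ is a nullary function symbol of sort $S$ and is $\mathcal{R}_C$-irreducible, then $t$ is not the left-hand side of a rule of $\mathcal{R}_C$, and $y\ \llbracket y : \lfloor x^{S} \rfloor \rrbracket \vdash t$.

If $t = f(t_1, \ldots, t_n)$ and is $\mathcal{R}_C$-irreducible, then every $t_i$ is $\mathcal{R}_C$-irreducible for $i \in [1..n]$, hence by induction hypothesis, $t_i \in L(\mathcal{G}_{NF}(\mathcal{R}_C), \lfloor u_i \rfloor)$ for some $\lfloor u_i \rfloor \in Q_{NF}(\mathcal{R}_C) \setminus \{\lfloor x^{\mathsf{Red}} \rfloor\}$. It means that for all $i \in [1..n]$, there is a derivation of $\mathcal{G}_{NF}(\mathcal{R}_C)$ of the form $y\ \llbracket y : \lfloor u_i \rfloor \rrbracket \vdash^* t_i\ \llbracket c_i \rrbracket$. By Fact 1, every $t_i$ is an instance of $u_i$, hence $t = f(u_1, \ldots, u_n)\tau$ for some ground substitution $\tau$. If there is a production rule $\lfloor u \rfloor := f(\lfloor u_1 \rfloor, \ldots, \lfloor u_n \rfloor)\ \llbracket c \rrbracket \in \Delta_{NF}(\mathcal{R}_C)$, with $\lfloor u \rfloor \in Q_{NF}(\mathcal{R}_C) \setminus \{\lfloor x^{\mathsf{Red}} \rfloor\}$ and $\tau \in sol(c)$, then the following derivation is possible: $y\ \llbracket y : \lfloor u \rfloor \rrbracket \vdash f(y_1, \ldots, y_n)\ \llbracket y_1 : \lfloor u_1\theta \rfloor \wedge \ldots \wedge y_n : \lfloor u_n\theta \rfloor \wedge c\theta \rrbracket \vdash^* t\ \llbracket c' \rrbracket$ where $c'$ is satisfiable, and $t \in L(\mathcal{G}_{NF}(\mathcal{R}_C), \lfloor u \rfloor)$. Assume that for every such production rule, we have $\tau \notin sol(c)$. It means by construction that there is a rule $u \to r\ \llbracket d \rrbracket \in \mathcal{R}_C$ such that $\tau \in sol(d)$, hence that $t$ is $\mathcal{R}_C$-reducible, a contradiction. □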

Using the observation that every ground constructor term is generated by $\mathcal{G}_{NF}(\mathcal{R}_C)$, we obtain as a corollary that $t \in L(\mathcal{G}_{NF}(\mathcal{R}_C), \lfloor x^{\mathsf{Red}} \rfloor)$ iff $t$ is $\mathcal{R}_C$-reducible.



5 Inference System 

In this section, we present an inference system for our inductive theorem proving procedure. Let us first summarize the key steps of our procedure with the following pseudo-algorithm^. The complete inference system, introduced by the examples of Section 3, is presented in detail in Subsections 5.2, 5.3 and 5.4.

We start with a conjecture (goal) $G$ (a constrained clause) and a rewrite system (with conditions and constraints) $\mathcal{R}$, with a subset $\mathcal{R}_C$ of constructor constrained (unconditional) rewrite rules.

^ Note that it is only a simplified version of the procedure, for presentation purposes, in order to give an intuition of how the procedure operates.

1. compute the constrained tree grammar $\mathcal{G}_{NF}(\mathcal{R}_C)$

2. given a goal (or subgoal) $C$, generate instances of $C$ by using the production rules of $\mathcal{G}_{NF}(\mathcal{R}_C)$. We obtain $C_1, \ldots, C_n$.

3. for each $C_i$, do:

(a) if $C_i$ is a tautology or $C_i$ is a constructor clause and can be detected as inductively valid then delete it

(b) else if we are in one of the two following cases:

i. $C_i$ is a constructor clause and is reducible using $\mathcal{R}_C$, or

ii. $C_i$ contains a non-constructor symbol and is reducible using $\mathcal{R}$ and induction hypotheses

then reduce $C_i$ into $C'_i$

(c) else disproof (the initial conjecture is not an inductive theorem)

4. if 3 did not fail then $C$ becomes an induction hypothesis

5. for each $C'_i$, do:

(a) if $C'_i$ is a tautology or it is a constructor clause and can be detected as inductively valid or it is subsumed by an axiom or induction hypothesis then delete it

(b) otherwise $C'_i$ becomes a new subgoal, go to 2.

If every subgoal is deleted, then $G$ is an inductive theorem of $\mathcal{R}$. The procedure may not terminate, and in this case appropriate lemmas should be added by the user in order to achieve termination.
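The control flow of steps 2 to 5 can be sketched as a worklist loop. This is a deliberately abstract skeleton and not the inference system of the paper: the three callbacks (`instantiate`, `deletable`, `reduce_clause`) are hypothetical placeholders for grammar-based instantiation, the deletion criteria and the simplification rules, the ordering restriction on the use of induction hypotheses is left to the callbacks, and a round limit replaces possible non-termination.

```python
def prove(goal, instantiate, deletable, reduce_clause, max_rounds=100):
    # Returns "proved", "disproof", or "unknown" when the round limit is hit.
    goals, hypotheses = [goal], []
    for _ in range(max_rounds):
        if not goals:
            return "proved"
        current = goals.pop()
        reduced = []
        for inst in instantiate(current):               # step 2
            if deletable(inst, hypotheses):             # step 3a
                continue
            new = reduce_clause(inst, hypotheses)       # step 3b
            if new is None:
                return "disproof"                       # step 3c
            reduced.append(new)
        hypotheses.append(current)                      # step 4
        goals.extend(c for c in reduced
                     if not deletable(c, hypotheses))   # step 5
    return "proved" if not goals else "unknown"

# Toy callbacks (purely illustrative): clauses are strings, P(x) is instantiated
# by 0 and s(x), P(0) is valid, and P(s(x)) rewrites to P(x).
def toy_instantiate(clause):
    return ["P(0)", "P(s(x))"] if clause == "P(x)" else [clause]

def toy_deletable(clause, hypotheses):
    return clause == "P(0)" or clause in hypotheses

def toy_reduce(clause, hypotheses):
    return "P(x)" if clause == "P(s(x))" else None

print(prove("P(x)", toy_instantiate, toy_deletable, toy_reduce))
```

In the toy run, the reduced instance P(x) is deleted in step 5 because it coincides with the induction hypothesis recorded in step 4, which is the pattern followed by the proof of Conjecture (1).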

The deletion criteria (steps 3a and 5a) include tautologies, forward subsumption, clauses with an unsatisfiable constraint, and constructor clauses that can be detected as inductively valid, under some conditions defined precisely below. The procedure for testing these criteria is based on a reduction to a tree grammar non-emptiness problem (does there exist at least one term generated by a given grammar), using $\mathcal{G}_{NF}(\mathcal{R}_C)$. In particular, it should be noted that we can decide validity this way for clauses $C_i$ which are ground irreducible [19, 21] (a notion central in inductive theorem proving / proof by consistency). It is also possible to decide ground irreducibility by means of a reduction to non-emptiness, following the lines of [11]. In Section 6, we show how such tests can be achieved effectively, provided that $\mathcal{R}$ is ground confluent, for some classes of tree grammars with equality and disequality constraints studied in former works [1, 8, 14, 11]. The extension to other kinds of constraints (like e.g. ordering constraints) requires algorithms for the corresponding classes of tree grammars (see discussions in Sections 7 and 8).

The reductions at step 3b are performed either with standard rewriting, or with inductive contextual rewriting (case 3(b)ii), or by case analysis (partial splitting in case 3(b)i and rewrite splitting in case 3(b)ii). These rules are defined formally in Sections 5.2 and 5.3.

5.1 Induction Ordering 

The inference and simplification rules below rely on an ordering defined on top of the following complexity measure on clauses.

Definition 3. The complexity of a constrained clause $C\ \llbracket c \rrbracket$ is the pair made of the two following components: $C$, ordered by the multiset extension of the ordering $>_e$ on literals, and the number of constraints $d\sigma$ not occurring in $c$, such that there exists $l \to r\ \llbracket d \rrbracket \in \mathcal{R}_C$ and $l\sigma$ is a subterm of $C$.

We denote by $\gg$ the ordering on constrained clauses defined as the lexicographic composition of the orderings on the two components of the complexities.
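The two ingredients of this definition, a multiset extension and a lexicographic composition, can be sketched generically. The code below is an illustration under assumptions of ours: literals are represented by hashable Python values, the base ordering on literals is passed as a function `greater` (here plain integer comparison stands in for $>_e$), and a complexity is a pair (list of literals, number of missing constraints).

```python
from collections import Counter

def multiset_greater(m, n, greater):
    # Multiset extension of `greater`: M > N iff M != N and every element
    # of N - M is dominated by some element of M - N.
    m, n = Counter(m), Counter(n)
    only_m, only_n = m - n, n - m
    if not only_m and not only_n:
        return False  # equal multisets
    return all(any(greater(x, y) for x in only_m) for y in only_n)

def complexity_greater(c1, c2, greater):
    # Lexicographic composition: compare the multisets of literals first,
    # then the numbers of constraints not yet occurring in the clause.
    lits1, missing1 = c1
    lits2, missing2 = c2
    if multiset_greater(lits1, lits2, greater):
        return True
    if Counter(lits1) == Counter(lits2):
        return missing1 > missing2
    return False

gt = lambda x, y: x > y
print(multiset_greater([3], [2, 2, 1], gt))
print(complexity_greater(([2, 1], 3), ([1, 2], 1), gt))
```

The second component is what makes a step that only adds the constraint of a rule to a clause decrease the complexity, since the clause itself is unchanged while one more constraint now occurs in it.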

5.2 Simplification Rules for Defined Functions 

Our procedure uses the simplification rules for defined symbols presented in Figure 2. The rules in this figure define the relation $\to_{\mathcal{D}}$ for simplifying constrained clauses using $\mathcal{R}_D$, and a given set $\mathcal{H}$ of constrained clauses considered as induction hypotheses.

Inductive Rewriting simplifies goals using the axioms of $\mathcal{R}_D$ as well as instances of the induction hypotheses of $\mathcal{H}$, provided that they are smaller than the goal. The underlying induction principle is based on the well-founded ordering $\gg$ on constrained clauses. This approach is more general than structural induction, which is more restrictive concerning simplification with induction hypotheses (see e.g. [7]). Inductive Contextual Rewriting can be viewed as a generalization of a rule in [29] to handle constraints by recursively discharging them as inductive conjectures. Rewrite Splitting simplifies a clause which contains a subterm matching some left member of a rule of $\mathcal{R}_D$. This inference checks moreover that all cases are covered for the application of $\mathcal{R}_D$, i.e. that for each ground substitution $\tau$, the conditions and the constraints of at least one rule are true wrt $\tau$. Note that this condition is always true when $\mathcal{R}$ is sufficiently complete, and hence that this check is superfluous in this case. Inductive Deletion deletes tautologies and clauses with unsatisfiable constraints.
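The coverage check performed by Rewrite Splitting can be illustrated on the three non-base rules of the second membership function, as they can be read off the subgoals (7.1) to (7.3): equality with the head, being smaller than the head of a term in normal form, and being greater than the head. The sketch below rests on this reading and on a bounded model (integers for naturals, tuples for set terms, strictly increasing tuples for normal forms); rule labels and function names are only labels of this sketch.

```python
from itertools import product

def is_normal_form(xs):
    # Assumed model of normal forms: strictly increasing tuples.
    return all(a < b for a, b in zip(xs, xs[1:]))

def applicable_rules(x, xs):
    # Rules whose constraint holds for the ground instance "x in' ins(xs[0], xs[1:])".
    head = xs[0]
    rules = []
    if x == head:
        rules.append("m'1")
    if x < head and is_normal_form(xs):   # ordering plus membership constraint
        rules.append("m'2")
    if head < x:
        rules.append("m'3")
    return rules

def uncovered(max_nat, max_len, only_normal_forms):
    # Ground instances for which no rule applies.
    missing = []
    for n in range(1, max_len + 1):
        for xs in product(range(max_nat), repeat=n):
            if only_normal_forms and not is_normal_form(xs):
                continue
            for x in range(max_nat):
                if not applicable_rules(x, xs):
                    missing.append((x, xs))
    return missing

print(uncovered(3, 3, only_normal_forms=True))
print((0, (1, 0)) in uncovered(3, 3, only_normal_forms=False))
```

In this model every instance built on a normal form is covered by some rule, whereas instances built on terms that are not in normal form may be left uncovered, which is why the instantiation by the normal form grammar matters for the case analysis.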



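Inductive Rewriting: $\{C\ \llbracket c \rrbracket\} \to_{\mathcal{D}} \{C'\ \llbracket c \rrbracket\}$
if $C\ \llbracket c \rrbracket \to_{\rho,\sigma} C'\ \llbracket c \rrbracket$, $l\sigma > r\sigma$ and $l\sigma > \Gamma\sigma$,
where $\rho = \Gamma \Rightarrow l \to r\ \llbracket c \rrbracket \in \mathcal{R}_D \cup \{\psi \mid \psi \in \mathcal{H}$ and $C\ \llbracket c \rrbracket \gg \psi\}$

Inductive Contextual Rewriting: $\{\Gamma \Rightarrow C[l\sigma]\ \llbracket c \rrbracket\} \to_{\mathcal{D}} \{\Gamma \Rightarrow C[r\sigma]\ \llbracket c \rrbracket\}$
if $\mathcal{R} \models_{ind} \Gamma \Rightarrow \Gamma'\sigma\ \llbracket c \wedge c'\sigma \rrbracket$, $l\sigma > r\sigma$ and $\{l\sigma\} >^{mul} \Gamma'\sigma$, where $\Gamma' \Rightarrow l \to r\ \llbracket c' \rrbracket \in \mathcal{R}_D$

Rewrite Splitting: $\{C[t]_p\ \llbracket c \rrbracket\} \to_{\mathcal{D}} \{\Gamma_i\sigma_i \Rightarrow C[r_i\sigma_i]_p\ \llbracket c \wedge c_i\sigma_i \rrbracket\}_{i \in [1..n]}$
if $\mathcal{R} \models_{ind} \Gamma_1\sigma_1\ \llbracket c_1\sigma_1 \rrbracket \vee \ldots \vee \Gamma_n\sigma_n\ \llbracket c_n\sigma_n \rrbracket$, $t > r_i\sigma_i$ and $\{t\} >^{mul} \Gamma_i\sigma_i$,
where the $\Gamma_i\sigma_i \Rightarrow l_i\sigma_i \to r_i\sigma_i\ \llbracket c_i\sigma_i \rrbracket$, $i \in [1..n]$, are all the instances of rules $\Gamma_i \Rightarrow l_i \to r_i\ \llbracket c_i \rrbracket \in \mathcal{R}_D$ such that $l_i\sigma_i = t$

Inductive Deletion: $\{C\ \llbracket c \rrbracket\} \to_{\mathcal{D}} \emptyset$ if $C\ \llbracket c \rrbracket$ is a tautology or $c$ is unsatisfiable

Figure 2: Simplification Rules for Defined Functions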

5.3 Simplification Rules for Constructors 

The simplification rules for constructors are presented in Figure 3. They define the relation $\to_{\mathcal{C}}$ for simplifying constrained clauses using $\mathcal{R}_C$ and $\mathcal{R}$.

Rewriting simplifies goals with axioms from $\mathcal{R}_C$. Partial Splitting eliminates ground reducible terms in a constrained clause $C\ \llbracket c \rrbracket$ by adding to $C\ \llbracket c \rrbracket$ the negation of the constraint of some rules of $\mathcal{R}_C$. Therefore, the saturated application of Partial Splitting and Rewriting will always lead to Deletion or to ground irreducible constructor clauses. Finally, Deletion and Validity remove respectively tautologies and clauses with unsatisfiable constraints, and ground irreducible constructor theorems of $\mathcal{R}$.

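Rewriting: $\{C\ \llbracket c \rrbracket\} \to_{\mathcal{C}} \{C'\ \llbracket c \rrbracket\}$ if $C\ \llbracket c \rrbracket \to^+_{\mathcal{R}_C} C'\ \llbracket c \rrbracket$ and $C\ \llbracket c \rrbracket \gg C'\ \llbracket c \rrbracket$

Partial Splitting: $\{C[l\sigma]_p\ \llbracket c \rrbracket\} \to_{\mathcal{C}} \{C[r\sigma]_p\ \llbracket c \wedge c'\sigma \rrbracket,\ C[l\sigma]_p\ \llbracket c \wedge \neg c'\sigma \rrbracket\}$
if $l \to r\ \llbracket c' \rrbracket \in \mathcal{R}_C$, $l\sigma > r\sigma$, and neither $c'\sigma$ nor $\neg c'\sigma$ is a subformula of $c$

Deletion: $\{C\ \llbracket c \rrbracket\} \to_{\mathcal{C}} \emptyset$ if $C\ \llbracket c \rrbracket$ is a tautology or $c$ is unsatisfiable

Validity: $\{C\ \llbracket c \rrbracket\} \to_{\mathcal{C}} \emptyset$
if $C\ \llbracket c \rrbracket$ is a ground irreducible constructor clause and $\mathcal{R} \models_{ind} C\ \llbracket c \rrbracket$

Figure 3: Simplification Rules for Constructors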

5.4 Induction Inference Rules 

The main inference system is displayed in Figure 4. Its rules apply to pairs $(\mathcal{E}, \mathcal{H})$ whose components are respectively the sets of current conjectures and of inductive hypotheses. Two inference rules below, Narrowing and Inductive Narrowing, use the grammar $\mathcal{G}_{NF}(\mathcal{R}_C)$ for instantiating variables. In order to be able to apply these inferences, according to the definition of term generation in Section 4.1, we shall initiate the process by adding to the conjectures one membership constraint for each variable.

Definition 4. Let $C\ \llbracket c \rrbracket$ be a constrained clause such that $c$ contains no membership constraint. The decoration of $C\ \llbracket c \rrbracket$, denoted $decorate(C\ \llbracket c \rrbracket)$, is the set of clauses $C\ \llbracket c \wedge x_1 : \lfloor u_1 \rfloor \wedge \ldots \wedge x_n : \lfloor u_n \rfloor \rrbracket$ where $\{x_1, \ldots, x_n\} = var(C)$, and for all $i \in [1..n]$, $\lfloor u_i \rfloor \in Q_{NF}(\mathcal{R}_C)$ and $sort(u_i) = sort(x_i)$.

The definition of decorate is extended to sets of constrained clauses as expected. A constrained clause $C\ \llbracket c \rrbracket$ is said to be decorated if $c = d \wedge x_1 : \lfloor u_1 \rfloor \wedge \ldots \wedge x_n : \lfloor u_n \rfloor$ where $\{x_1, \ldots, x_n\} = var(C)$, and for all $i \in [1..n]$, $\lfloor u_i \rfloor \in Q_{NF}(\mathcal{R}_C)$, $sort(u_i) = sort(x_i)$, and $d$ does not contain membership constraints.
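Decoration amounts to a cartesian product over the non-terminals compatible with the sort of each variable. The sketch below is a minimal illustration with hypothetical data structures: the variables of a clause are given as a dictionary from names to sorts, the non-terminals of the grammar as a dictionary from sorts to lists of (ASCII) names, and each decorated clause is represented only by its assignment of a non-terminal to every variable.

```python
from itertools import product

def decorate(variables, nonterminals):
    # One membership constraint per variable, for every compatible combination.
    names = sorted(variables)
    choices = [nonterminals[variables[v]] for v in names]
    return [dict(zip(names, combo)) for combo in product(*choices)]

# Non-terminals of the running example, excluding the one for reducible terms
# (names are an ASCII rendering chosen for this sketch).
nonterminals = {
    "Nat": ["[x:Nat]"],
    "Set": ["[x:Set]", "[ins(x1,y1)]"],
}

for assignment in decorate({"x": "Nat", "y": "Set"}, nonterminals):
    print(assignment)
```

With one variable of sort Nat and one of sort Set, two decorated clauses are obtained, in line with the two clauses (2.a) and (2.b) of Section 3.4.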

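Simplification, resp. Inductive Simplification, reduces conjectures according to the rules of Section 5.3, resp. 5.2. Inductive Narrowing generates new subgoals by application of the production rules of the constrained grammar $\mathcal{G}_{NF}(\mathcal{R}_C)$ until the obtained clause is deep enough to cover the left-hand sides of rules of $\mathcal{R}_D$. Each obtained clause must be simplified by one of the rules of Figure 2 (otherwise, if one instance cannot be simplified, then the rule Inductive Narrowing cannot be applied). For the sake of efficiency, the application can be restricted to so-called induction variables, as defined in [2] (see Section 3.3), while preserving all the results of the next section. Narrowing is similar and uses the rules of Figure 3 for simplification. This rule makes it possible to eliminate the ground reducible constructor terms in a clause by simplifying their instances, while deriving conjectures considered as new subgoals. The criterion on depth is the same for Inductive Narrowing and Narrowing and is a bit rough, for the sake of clarity of the inference rules. However, in practice, it can be replaced by a tighter condition (with, e.g., a distinction between $\mathcal{R}_C$ and $\mathcal{R}_D$) while preserving the results of the next section. Subsumption deletes clauses redundant with axioms of $\mathcal{R}$, induction hypotheses of $\mathcal{H}$ and other conjectures not yet proved (in $\mathcal{E}$).

Simplification: $\frac{(\mathcal{E} \cup \{C\ \llbracket c \rrbracket\},\ \mathcal{H})}{(\mathcal{E} \cup \mathcal{E}',\ \mathcal{H})}$ if $\{C\ \llbracket c \rrbracket\} \to_{\mathcal{C}} \mathcal{E}'$

Inductive Simplification: $\frac{(\mathcal{E} \cup \{C\ \llbracket c \rrbracket\},\ \mathcal{H})}{(\mathcal{E} \cup \mathcal{E}',\ \mathcal{H})}$ if $\{C\ \llbracket c \rrbracket\} \to_{\mathcal{D}} \mathcal{E}'$

Narrowing: $\frac{(\mathcal{E} \cup \{C\ \llbracket c \rrbracket\},\ \mathcal{H})}{(\mathcal{E} \cup \mathcal{E}_1 \cup \ldots \cup \mathcal{E}_n,\ \mathcal{H} \cup \{C\ \llbracket c \rrbracket\})}$
if $\{C_i\ \llbracket c_i \rrbracket\} \to_{\mathcal{C}} \mathcal{E}_i$, where $\{C_1\ \llbracket c_1 \rrbracket, \ldots, C_n\ \llbracket c_n \rrbracket\}$ is the set of all clauses such that $C\ \llbracket c \rrbracket \vdash^+ C_i\ \llbracket c_i \rrbracket$ and $d(C_i) - d(C) \leq d(\mathcal{R}) - 1$

Inductive Narrowing: $\frac{(\mathcal{E} \cup \{C\ \llbracket c \rrbracket\},\ \mathcal{H})}{(\mathcal{E} \cup \mathcal{E}_1 \cup \ldots \cup \mathcal{E}_n,\ \mathcal{H} \cup \{C\ \llbracket c \rrbracket\})}$
if $\{C_i\ \llbracket c_i \rrbracket\} \to_{\mathcal{D}} \mathcal{E}_i$, where $\{C_1\ \llbracket c_1 \rrbracket, \ldots, C_n\ \llbracket c_n \rrbracket\}$ is the set of all clauses such that $C\ \llbracket c \rrbracket \vdash^+ C_i\ \llbracket c_i \rrbracket$ and $d(C_i) - d(C) \leq d(\mathcal{R}) - 1$

Subsumption: $\frac{(\mathcal{E} \cup \{C\ \llbracket c \rrbracket\},\ \mathcal{H})}{(\mathcal{E},\ \mathcal{H})}$ if $C\ \llbracket c \rrbracket$ is subsumed by another clause of $\mathcal{R} \cup \mathcal{E} \cup \mathcal{H}$

Disproof: $\frac{(\mathcal{E} \cup \{C\ \llbracket c \rrbracket\},\ \mathcal{H})}{(\bot,\ \mathcal{H})}$ if no other rule applies to the clause $C\ \llbracket c \rrbracket$

Figure 4: Induction Inference Rules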



5.5 Soundness and Completeness 

We show now that our inference system is sound and refutationally complete. The proof of soundness is not straightforward. The main difficulty is to make sure that the exhaustive application of the rules preserves a counterexample when one exists. We will show more precisely that a minimal counterexample is preserved along a fair derivation.

A derivation is a sequence of inference steps generated by a pair of the form $(\mathcal{E}_0, \emptyset)$ using the inference rules in $\mathcal{I}$, written $(\mathcal{E}_0, \emptyset) \vdash_{\mathcal{I}} (\mathcal{E}_1, \mathcal{H}_1) \vdash_{\mathcal{I}} \ldots$ It is called fair if the set of persistent constrained clauses $\bigcup_i \bigcap_{j \geq i} \mathcal{E}_j$ is empty or equal to $\{\bot\}$. The derivation is said to be a disproof in the latter case, and a success in the former.

Finite success is obtained when the set of conjectures to be proved is exhausted. Infinite success is obtained when the procedure diverges, assuming fairness. When it happens, the clue is to guess some lemmas which are used to subsume or simplify the generated infinite family of subgoals, therefore stopping the divergence. This is possible in principle with our approach, since lemmas can be specified in the same way as axioms are.

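Theorem 1 (Soundness of successful derivations). Assume that $\mathcal{R}_C$ is
terminating and that $\mathcal{R}$ is sufficiently complete. Let $\mathcal{D}_0$ be a set of
unconstrained clauses and let $\mathcal{E}_0 = \mathit{decorate}(\mathcal{D}_0)$. If there exists a successful derivation
$(\mathcal{E}_0, \emptyset) \vdash_{\mathcal{I}} (\mathcal{E}_1, \mathcal{H}_1) \vdash_{\mathcal{I}} \cdots$ then $\mathcal{R} \models_{ind} \mathcal{D}_0$.

Proof. Assume that $\mathcal{R} \not\models_{ind} \mathcal{D}_0$, and let
$(\mathcal{E}_0, \emptyset) \vdash_{\mathcal{I}} (\mathcal{E}_1, \mathcal{H}_1) \vdash_{\mathcal{I}} \cdots$
be an arbitrary successful derivation. By the following Fact, we have that $\mathcal{R} \not\models_{ind} \mathcal{E}_0$.

Fact 2 Assume that $\mathcal{R}_C$ is terminating and that $\mathcal{R}$ is sufficiently complete. If
$\mathcal{R} \models_{ind} \mathcal{E}_0$ then $\mathcal{R} \models_{ind} \mathcal{D}_0$.

Proof. Assume that $\mathcal{R} \models_{ind} \mathcal{E}_0$ and that for some clause $C \in \mathcal{D}_0$ we have
$\mathcal{R} \not\models_{ind} C$. Let $\{C [\![c_1]\!], \ldots, C [\![c_n]\!]\} = \mathit{decorate}(C)$. For all $i \in [1..n]$, we have
$\mathcal{R} \models_{ind} C [\![c_i]\!]$, but there exists $\sigma \notin \bigcup_{i=1}^{n} \mathit{sol}(c_i)$ such that $\mathcal{R} \not\models C\sigma$. Since $\mathcal{R}$ is
sufficiently complete and $\mathcal{R}_C$ is terminating, we can rewrite $\sigma$ into a constructor
and $\mathcal{R}_C$-irreducible ground substitution $\sigma'$. By Lemma 1, it follows that
$\sigma' \in \mathit{sol}(c_i)$ for some $i \in [1..n]$, and therefore that $\mathcal{R} \models C\sigma'$, a contradiction with
$\mathcal{R} \not\models_{ind} C\sigma$. □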

Let $D_0$ be a clause, minimal wrt $\gg$ in the set:

$\{D\sigma \mid D [\![d]\!] \in \bigcup_i \mathcal{E}_i,\ \sigma \in \mathit{sol}(d)$ is constructor and irreducible and $\mathcal{R} \not\models D\sigma\}$

Note that such a clause exists since we have proved that $\mathcal{R} \not\models_{ind} \mathcal{E}_0$. Let $C [\![c]\!]$
be a clause of $\bigcup_i \mathcal{E}_i$ minimal by subsumption ordering and $\theta \in \mathit{sol}(c)$, irreducible
and constructor ground substitution, be such that $C\theta = D_0$.

We show that whatever inference, other than Disproof, is applied to $C [\![c]\!]$, a
contradiction is obtained, hence that the above derivation is not successful.

Inductive Narrowing. Suppose that the inference Inductive Narrowing is applied to
$C [\![c]\!]$. By hypothesis, $C$ has been decorated, i.e. $c = d \wedge x_1 : \llcorner u_1 \lrcorner \wedge \ldots \wedge x_n : \llcorner u_n \lrcorner$
with $\{x_1, \ldots, x_n\} = \mathit{var}(C)$ and for all $i \in [1..n]$, $\llcorner u_i \lrcorner \in Q_{NF}(\mathcal{R}_C)$. Hence, since
$\theta \in \mathit{sol}(c)$, there exist $\sigma$ and $\tau$ such that $\theta = \sigma\tau$ and $C [\![c]\!] \vdash^{+} C\sigma [\![c']\!]$.

$C\sigma [\![c']\!]$ cannot be a tautology and $c'$ cannot be unsatisfiable and therefore the
rule Inductive Deletion cannot be applied.






Let $C'$ be the result of the application of the rule Inductive Rewriting to $C\sigma [\![c']\!]$.
The instances of clauses of $\mathcal{H} \cup \mathcal{E} \cup \{C\}$ used in the rewriting step are smaller
than $C\theta$ wrt $\gg$ and therefore, they are inductive theorems of $\mathcal{R}$. Hence
$\mathcal{R} \not\models C'\tau$. Moreover, $C\theta \gg C'\tau$ and $C' \in \bigcup_i \mathcal{E}_i$, which is a contradiction.

With similar arguments as above, we can show that the rule Inductive Contextual
Rewriting cannot be applied to $C\sigma [\![c']\!]$.

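Assume that the rule Rewrite Splitting is applied to $C[t]_p\sigma [\![c']\!]$. Let

$\{\Gamma_1 \Rightarrow l_1 \to r_1 [\![c_1]\!], \ldots, \Gamma_n \Rightarrow l_n \to r_n [\![c_n]\!]\}$

be the non-empty subset of $\mathcal{R}_D$ such that for all $i$ in $[1..n]$, $t = l_i\sigma_i$ and

$\mathcal{R} \models_{ind} \Gamma_1\sigma_1 [\![c' \wedge c_1\sigma_1]\!] \vee \ldots \vee \Gamma_n\sigma_n [\![c' \wedge c_n\sigma_n]\!]$

The result of the application of Rewrite Splitting is:

$\{\Gamma_1\sigma_1 \Rightarrow C[r_1\sigma_1]_p [\![c' \wedge c_1\sigma_1]\!], \ldots, \Gamma_n\sigma_n \Rightarrow C[r_n\sigma_n]_p [\![c' \wedge c_n\sigma_n]\!]\}$

Then there exists $k$ such that $\mathcal{R} \models \Gamma_k\sigma_k\delta$ for some $\delta \in \mathit{sol}(c' \wedge c_k\sigma_k)$. Let
$C_k = \Gamma_k\sigma_k \Rightarrow C[r_k\sigma_k]_p [\![c' \wedge c_k\sigma_k]\!]$. We have $\mathcal{R} \not\models C_k\delta$, since $\mathcal{R} \models \Gamma_k\sigma_k\delta$,
$\mathcal{R} \models t\delta = r_k\sigma_k\delta$, and $\mathcal{R} \not\models C\theta$. On the other hand, $C\theta \gg C_k\delta$ since
$\{t\} \gg \Gamma_k\sigma_k$, and $t > r_k\sigma_k$. This contradicts the minimality of $C\theta$.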

Narrowing, Inductive Simplification and Simplification. These cases are similar to 
the previous one. 

Subsumption: Since $\mathcal{R} \not\models C\theta$, $C [\![c]\!]$ cannot be subsumed by an axiom of $\mathcal{R}$. If
there exists C" |c'] G H U (f \ {C |c]}) such that C H = C'6 (c'Sj V D, then we 
have TZ ^ C"(50 {0 G so?(c')). Hence, r = and 5 = 0, since C |c] is minimum in 
$\bigcup_i \mathcal{E}_i$ wrt subsumption ordering. Therefore, $C' \notin (\mathcal{E} \setminus \{C\})$. Moreover, $C' \notin \mathcal{H}$,
otherwise the inference Inductive Narrowing or Narrowing could also be applied
to $C [\![c]\!]$, in contradiction with previous cases. Hence, Subsumption cannot be
applied to $C [\![c]\!]$. □

Since there are only two kinds of fair derivations, we obtain as a corollary: 

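Corollary 1 (Refutational completeness). Assume that $\mathcal{R}_C$ is terminating
and that $\mathcal{R}$ is sufficiently complete. Let $\mathcal{D}_0$ be a set of unconstrained clauses
and let $\mathcal{E}_0 = \mathit{decorate}(\mathcal{D}_0)$. If $\mathcal{R} \not\models_{ind} \mathcal{E}_0$, then all fair derivations starting from
$(\mathcal{E}_0, \emptyset)$ end up with $(\bot, \mathcal{H})$.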

When we assume that all the variables in goals are decorated (restricting the
domain for these variables to ground constructor irreducible terms), the above
hypotheses that $\mathcal{R}_C$ is terminating and $\mathcal{R}$ is sufficiently complete can be dropped.

Theorem 2 (Soundness of successful derivations). Let $\mathcal{E}_0$ be a set of
decorated constrained clauses. If there exists a successful derivation
$(\mathcal{E}_0, \emptyset) \vdash_{\mathcal{I}} (\mathcal{E}_1, \mathcal{H}_1) \vdash_{\mathcal{I}} \cdots$ then $\mathcal{R} \models_{ind} \mathcal{E}_0$.

Proof. The proof is the same as for Theorem 1 except that we do not need
Fact 2, since the goals of $\mathcal{E}_0$ are already decorated. Hence we need neither
the hypothesis that $\mathcal{R}_C$ is terminating nor the hypothesis that $\mathcal{R}$ is sufficiently
complete, which were only used for the proof of Fact 2. □

As a consequence of the above theorem, we immediately have the refutational
completeness of our inference system if the goals are decorated constrained
clauses.

Corollary 2 (Refutational completeness). Let $\mathcal{E}_0$ be a set of decorated
constrained clauses. If $\mathcal{R} \not\models_{ind} \mathcal{E}_0$, then all fair derivations starting from
$(\mathcal{E}_0, \emptyset)$ end up with $(\bot, \mathcal{H})$.

We shall see in Section 7 some examples of applications of Theorem 2 and
Corollary 2 to specifications which are not sufficiently complete.

Our inference system can refute false conjectures. This result is a consequence 
of the following lemma. 

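Lemma 2. Let $(\mathcal{E}_i, \mathcal{H}_i) \vdash_{\mathcal{I}} (\mathcal{E}_{i+1}, \mathcal{H}_{i+1})$ be a derivation step. If $\mathcal{R} \models_{ind} \mathcal{E}_i \cup \mathcal{H}_i$
then $\mathcal{R} \models_{ind} \mathcal{E}_{i+1} \cup \mathcal{H}_{i+1}$.

Proof. Let $C [\![c]\!]$ be a clause in $\mathcal{E}_i$ and $(\mathcal{E}_i \cup \{C [\![c]\!]\}, \mathcal{H}_i) \vdash_{\mathcal{I}} (\mathcal{E}_{i+1}, \mathcal{H}_{i+1})$ be a
derivation step obtained by the application of an inference to $C [\![c]\!]$ and assume
that $\mathcal{R} \models_{ind} \mathcal{E}_i \cup \mathcal{H}_i$. By hypothesis, the instances of clauses of $\mathcal{H} \cup \mathcal{E} \cup \{C [\![c]\!]\}$
which are used during rewriting steps are valid. Hence, we can show that
$\mathcal{R} \models_{ind} \mathcal{E}_{i+1} \cup \mathcal{H}_{i+1}$ by a case analysis according to the rule applied to $C [\![c]\!]$. □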

The following lemma is also used in the proof of soundness of disproof. 

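Lemma 3. If $\mathcal{R}$ is ground confluent and sufficiently complete then for every
constructor clause $C [\![c]\!]$, if $\mathcal{R} \models_{ind} C [\![c]\!]$ then $\mathcal{R}_C \models_{ind} C [\![c]\!]$.

Proof. Let $\tau \in \mathit{sol}(c)$ be a substitution grounding for $C$. By the sufficient
completeness of $\mathcal{R}$, we may assume without loss of generality that $\tau$ is a constructor
substitution. By hypothesis, $\mathcal{R} \models C\tau$. Assume that for some literal $u = v$ of $C$,
we have $\mathcal{R} \models u\tau = v\tau$. Since $\mathcal{R}$ is ground confluent, it means that $u\tau \downarrow_{\mathcal{R}} v\tau$, and
hence that $u\tau \downarrow_{\mathcal{R}_C} v\tau$, i.e. $\mathcal{R}_C \models u\tau = v\tau$, because $u\tau, v\tau \in T(\mathcal{C})$. Moreover, if
$\mathcal{R} \models u\tau \neq v\tau$ then $\mathcal{R}_C \models u\tau \neq v\tau$ because $\mathcal{R}_C \subseteq \mathcal{R}$. □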

Theorem 3 (Soundness of disproof). Assume that $\mathcal{R}$ is strongly complete
and ground confluent. If a derivation starting from $(\mathcal{E}_0, \emptyset)$ returns the pair
$(\bot, \mathcal{H})$, then $\mathcal{R} \not\models_{ind} \mathcal{E}_0$.

Proof. Under our assumptions, there exists a step $k$ in the derivation, such that
Disproof applies to a constrained clause $C [\![c]\!]$ in $\mathcal{E}_k$.

We prove first that $C [\![c]\!]$ is a constructor clause. Assume indeed that $C [\![c]\!]$
contains a term of the form $f(t_1, \ldots, t_n)$, where $f \in \mathcal{D}$ and for all $i \in [1..n]$,
$t_i \in T(\mathcal{C}, \mathcal{X})$. The constraint $c$ is satisfiable, otherwise Inductive Deletion could
be applied. Let $\tau \in \mathit{sol}(c)$. Hence by Lemma 1, for each $x \in \mathit{var}(C)$, $x\tau$ is in
$\mathcal{R}_C$-normal form. We have now two possibilities:

1. for one $i \in [1..n]$, $t_i\tau$ is reducible. In this case, there exists a substitution
$\sigma$ such that $\tau = \sigma\theta$ and $t_i [\![c]\!] \vdash^{+} t_i\sigma [\![c']\!]$ and $t_i\sigma$ contains as a subterm
an instance of a left-hand side of a rule of $\mathcal{R}_C$. Therefore, either Rewriting or
Partial Splitting can be applied to $t_i\sigma [\![c']\!]$. It implies that Narrowing can be
applied to $C [\![c]\!]$, which is a contradiction.

2. every $t_i\tau$ is irreducible. The term $f(t_1, \ldots, t_n)\tau$ is reducible at root position
because $f$ is strongly complete wrt $\mathcal{R}$. Then there exists $\sigma$ such that $\tau = \sigma\theta$
and $f(t_1, \ldots, t_n) [\![c]\!] \vdash^{+} f(t_1, \ldots, t_n)\sigma [\![c']\!]$ and moreover $f(t_1, \ldots, t_n)\sigma$
is an instance of a left-hand side of a rule of $\mathcal{R}_D$. Therefore, either Inductive
Rewriting or Rewrite Splitting can be applied. Indeed the application condition
of the latter inference is a consequence of the strong completeness of $\mathcal{R}$.
Hence, the inference Inductive Narrowing can be applied to $C [\![c]\!]$, which is a
contradiction.

In conclusion, the clause $C [\![c]\!]$ contains only constructor terms.

Then, we deduce that $C [\![c]\!]$ contains ground irreducible terms only, otherwise
Narrowing would apply. Since Validity does not apply either, $C [\![c]\!]$ is not an
inductive consequence of $\mathcal{R}_C$. By Lemma 3, and since $\mathcal{R}$ is ground confluent,
we conclude that $C [\![c]\!]$ is not an inductive theorem of $\mathcal{R}$. As a consequence,
$\mathcal{R} \not\models_{ind} \mathcal{E}_k$. Finally, by Lemma 2, we deduce that $\mathcal{R} \not\models_{ind} \mathcal{E}_0$. □

5.6 Handling Non- Terminating Constructor Systems 

Our procedure applies rules of $\mathcal{R}_C$ and $\mathcal{R}_D$ only when they reduce the terms wrt
the given simplification ordering $>$. This is ensured when the rewrite relations
induced by $\mathcal{R}_C$ and $\mathcal{R}_D$ are compatible with $>$, and hence when $\mathcal{R}_C$ and $\mathcal{R}_D$ are
terminating (separately), as in the example of Section 3. Note that this is in
contrast with other procedures such as [7, 2], where the termination of the whole
system $\mathcal{R}$ is required.

If $\mathcal{R}_C$ is non-terminating then one can apply e.g. the constrained completion
technique [23] in order to generate an equivalent orientable theory (with ordering
constraints). The theory obtained (if the completion succeeds) can then be
handled by our approach.

Example 2. Consider this non-terminating system for sets:

    $ins(x, ins(x, y)) = ins(x, y)$
    $ins(x, ins(x', y)) = ins(x', ins(x, y))$

Applying the completion procedure we obtain the constrained system of Section 3. ◇
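As an illustration only, the following Python sketch normalizes ground set terms with the two axioms above read from left to right, the second one being guarded by an ordering constraint. Several points are assumptions made for this example and are not taken from the text: the constrained system of Section 3 is not repeated here, so the guard `x > x'` is our reading of it; natural numbers are encoded as Python integers rather than unary terms; and the names `EMPTY`, `ins`, `step_root` and `normalize` are hypothetical.

```python
# Ground set terms: EMPTY is the empty set, ("ins", x, s) inserts x into s.
# Assumption: elements are Python ints compared with the usual order.
EMPTY = ("empty",)


def ins(x, s):
    return ("ins", x, s)


def step_root(t):
    """Apply one oriented rule at the root of t, or return None."""
    if t[0] == "ins" and t[2][0] == "ins":
        x, (_, y, rest) = t[1], t[2]
        if x == y:
            # ins(x, ins(x, y)) -> ins(x, y)
            return ins(x, rest)
        if x > y:
            # ins(x, ins(x', y)) -> ins(x', ins(x, y))  when x > x' (assumed guard)
            return ins(y, ins(x, rest))
    return None


def normalize(t):
    """Innermost normalization: normalize the tail, then rewrite at the root."""
    if t[0] != "ins":
        return t
    t = ins(t[1], normalize(t[2]))
    reduct = step_root(t)
    return t if reduct is None else normalize(reduct)


if __name__ == "__main__":
    print(normalize(ins(3, ins(1, ins(3, ins(2, EMPTY))))))
```

Under these assumptions the guard makes the commutativity rule applicable in one direction only, so normal forms are strictly increasing sequences of insertions.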
6 Decision Procedures for Conditions in Inference Rules

We present a reduction of the conditions in the inference rules of Figures 2, 3,
and 4 to emptiness decision problems for tree automata with constraints. We
deduce a decision procedure for these tests in the case where the constraints in
the specification are limited to syntactic equality and disequality.

We assume here that, as in Theorem 1, the inference system is applied to a
set $\mathit{decorate}(\mathcal{D}_0)$ where $\mathcal{D}_0$ is a set of unconstrained clauses.

6.1 Reductions 

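Consider the following decision problems, given two constrained grammars $\mathcal{G}$, $\mathcal{G}'$
and two non-terminals $\llcorner u \lrcorner$, $\llcorner u' \lrcorner$ of respectively $\mathcal{G}$ and $\mathcal{G}'$:

(ED) emptiness decision: $L(\mathcal{G}, \llcorner u \lrcorner) = \emptyset$?

(EI) emptiness of intersection: $L(\mathcal{G}, \llcorner u \lrcorner) \cap L(\mathcal{G}', \llcorner u' \lrcorner) = \emptyset$?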
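For plain tree grammars without constraints, both problems have a classical solution, sketched below in Python: (ED) by marking productive non-terminals, and (EI) by building a product grammar. The sketch deliberately ignores the constraints attached to production rules, which are precisely what makes these problems hard for constrained grammars (see Section 6.2). The encoding of a production as a triple `(lhs, symbol, arguments)` and all function names are assumptions made for this illustration.

```python
def productive(rules):
    """Return the non-terminals that generate at least one ground term.

    rules: list of (lhs, symbol, [argument non-terminals]).
    """
    alive = set()
    changed = True
    while changed:
        changed = False
        for lhs, _symbol, args in rules:
            if lhs not in alive and all(a in alive for a in args):
                alive.add(lhs)
                changed = True
    return alive


def is_empty(rules, start):
    """(ED) for an unconstrained grammar."""
    return start not in productive(rules)


def intersect(rules1, rules2):
    """Product grammar: (p, q) := f((p1, q1), ..., (pn, qn))."""
    return [((l1, l2), f, list(zip(a1, a2)))
            for (l1, f, a1) in rules1
            for (l2, g, a2) in rules2
            if f == g and len(a1) == len(a2)]


def intersection_is_empty(rules1, start1, rules2, start2):
    """(EI) for unconstrained grammars, via emptiness of the product."""
    return is_empty(intersect(rules1, rules2), (start1, start2))


if __name__ == "__main__":
    nat = [("N", "0", []), ("N", "s", ["N"])]
    parity = [("E", "0", []), ("E", "s", ["O"]), ("O", "s", ["E"])]
    print(is_empty(nat, "N"))
    print(intersection_is_empty(parity, "E", parity, "O"))
```

In the usage part, `parity` generates the even numbers from `E` and the odd numbers from `O`, so the two languages are disjoint.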

Ground instances. Let $t [\![c]\!]$ be a constrained term (or clause) such that
the constraint $c$ has the form $x_1 : \llcorner u_1 \lrcorner \wedge \ldots \wedge x_m : \llcorner u_m \lrcorner \wedge d$ where $d$ contains
no membership constraints. Note that starting with decorated clauses, any
goal or subgoal occurring during the inference is of the above form. The set
of ground instances of $t$ satisfying $c$ is recognized by a constrained grammar
$\mathcal{G}(t [\![c]\!]) = (Q(t [\![c]\!]), \Delta(t [\![c]\!]))$ whose construction is described in Figure 5.
For technical reasons concerning the separation of non-terminals, we use in the
construction of $\mathcal{G}(t [\![c]\!])$ a relabeling isomorphism $^\circ$ from the signature $(\mathcal{S}, \mathcal{F})$ to the
signature $(\mathcal{S}^\circ, \mathcal{F}^\circ)$, such that the function symbol $f^\circ$ has profile $S_1^\circ \times \ldots \times S_n^\circ \to S^\circ$
if $f$ has profile $S_1 \times \ldots \times S_n \to S$, and its extension from $T(\mathcal{F}, \mathcal{X})$ to $T(\mathcal{F}^\circ, \mathcal{X})$,
such that (recursively) $f(t_1, \ldots, t_n)^\circ = f^\circ(t_1^\circ, \ldots, t_n^\circ)$, and for each $x \in \mathcal{X}$, $x^\circ = x$.





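$Q(t [\![x_1 : \llcorner u_1 \lrcorner \wedge \ldots \wedge x_m : \llcorner u_m \lrcorner \wedge d]\!]) = Q_{NF}(\mathcal{R}_C) \cup \{\llcorner u^\circ \lrcorner \mid u \unlhd t\}$

$\Delta(t [\![x_1 : \llcorner u_1 \lrcorner \wedge \ldots \wedge x_m : \llcorner u_m \lrcorner \wedge d]\!])$ contains all the production rules of $\Delta_{NF}(\mathcal{R}_C)$ plus:

  $\llcorner t^\circ \lrcorner := g(\llcorner s_1 \lrcorner, \ldots, \llcorner s_m \lrcorner)\ [\![d]\!]$ if $t = g(t_1, \ldots, t_m)$

  and every $\llcorner f^\circ(v_1, \ldots, v_n) \lrcorner := f(\llcorner s_1 \lrcorner, \ldots, \llcorner s_n \lrcorner)$ such that $f(v_1, \ldots, v_n) \lhd t$,

  and for all $j \le n$:
    if $v_j = x_i$ for some $i$, then $\llcorner s_j \lrcorner = \llcorner u_i \lrcorner$;
    if $v_j \in \mathcal{X} \setminus \{x_1, \ldots, x_m\}$, then $\llcorner s_j \lrcorner \in Q_{NF}(\mathcal{R}_C)$;
    if $v_j \notin \mathcal{X}$, then $s_j = v_j^\circ$.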





Figure 5: Constrained grammar $\mathcal{G}(t [\![c]\!])$ for ground instances



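Lemma 4. $L(\mathcal{G}(t [\![c]\!]), \llcorner t^\circ \lrcorner) = \{t\sigma \mid \sigma|_{\mathit{var}(c)} \in \mathit{sol}(c)\}$.

Proof. The proofs of both inclusions are straightforward inductions, resp. on
the length of a derivation of a term of $L(\mathcal{G}(t [\![c]\!]), \llcorner t^\circ \lrcorner)$ and on a ground instance
$t\sigma$ such that $\sigma|_{\mathit{var}(c)}$ is a solution of $c$. □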






Constraints unsatisfiability. This property is required for rules Inductive
Rewriting, Inductive Contextual Rewriting, Rewrite Splitting, Inductive Deletion,
Deletion, and Subsumption.

Lemma 5. Given a constraint $c$, there exists a constrained grammar $\mathcal{G}(c)$ such
that $c$ is unsatisfiable iff $L(\mathcal{G}(c)) = \emptyset$.

Proof. Let $x_1, \ldots, x_m$ be the list of all the variables occurring in $c$, possibly
with repetition in case of multiple occurrences. Let $y_1, \ldots, y_m$ be a list
of fresh distinct variables, let $f^m$ be a new function symbol of arity $m$ and
let $\bar{c} = \bigwedge_{i=1}^{m} y_i \approx x_i$. The constrained grammar $\mathcal{G}(c)$ is defined by
$\mathcal{G}(c) = \mathcal{G}(f^m(y_1, \ldots, y_m) [\![c \wedge \bar{c}]\!])$. □

Corollary 3. Constraints unsatisfiability is reducible to (ED). 

Ground (ir)reducibility. The rules Validity, hence Simplification, and Disproof 

(by negation) check ground irreducibility. 

Lemma 6. Ground reducibility and ground irreducibility decision are reducible 
to (EI). 

Proof. By definition and Lemmas 1 and 4, a constrained clause C |c] is ground 
reducible iff L(C?(C|c])) n L{g^F{nc),QNF{nc) \{^x^^''}) = and ground 
irreducible iff L{g{C |c])) n ^g^piTle), ^a;?"^) =0. □ 

Validity of ground irreducible constructor clauses. The rule Validity,
hence Simplification, checks this property.

Lemma 7. When $\mathcal{R}$ is ground confluent, validity of ground irreducible constructor
constrained clauses is reducible to (ED).

Proof. Let $C [\![c]\!]$ be a ground irreducible constructor constrained clause. Let $\tilde{C}$
be the constraint obtained from $C$ by replacement of every equation $s = t$ (resp.
disequation $s \neq t$) by the atom $s \approx t$ (resp. $s \not\approx t$). Since $C [\![c]\!]$ is ground
irreducible and $\mathcal{R}$ is ground confluent, we have that $C [\![c]\!]$ is valid in the initial
model of $\mathcal{R}$ iff every substitution $\sigma \in \mathit{sol}(c)$ grounding for $C$ is such that
$\sigma \in \mathit{sol}(\tilde{C})$. This is equivalent to $L(\mathcal{G}(C [\![c \wedge \neg\tilde{C}]\!])) = \emptyset$. □

6.2 Decision 

It remains to give decision procedures for (ED) and (EI). We proceed by reduc- 
tion to analogous problems on tree automata with (dis)equality constraints [10], 
for a class of tree grammars defined as follows. 

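Definition 5. A constrained grammar $\mathcal{G}$ is called normalized if for each of
its productions $\llcorner t \lrcorner := f(\llcorner u_1 \lrcorner, \ldots, \llcorner u_n \lrcorner)\ [\![c]\!]$ all the atomic constraints in $c$
have the form $P(s_1, \ldots, s_k)$ where $P$ is a constraint predicate symbol and
$s_1, \ldots, s_k$ are strict subterms of $f(u_1, \ldots, u_n)$.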
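The condition of Definition 5 is purely syntactic and can be checked production by production. The Python sketch below does so for a single production, under an encoding assumed for this example only: terms are nested tuples `(symbol, arg_1, ..., arg_n)`, variables are strings, and an atomic constraint is a pair made of a predicate name and the list of its argument terms. The function names are hypothetical.

```python
def strict_subterms(t):
    """All proper subterms of a term (symbol, arg_1, ..., arg_n); variables are strings."""
    found = []
    if isinstance(t, tuple):
        for arg in t[1:]:
            found.append(arg)
            found.extend(strict_subterms(arg))
    return found


def is_normalized(symbol, arg_terms, constraints):
    """Check one production  t := symbol(u_1, ..., u_n) [[c]].

    arg_terms:   the terms u_1, ..., u_n labelling the argument non-terminals.
    constraints: list of (predicate, [terms]) for the atomic constraints of c.
    """
    allowed = strict_subterms((symbol, *arg_terms))
    return all(s in allowed for _pred, args in constraints for s in args)


if __name__ == "__main__":
    left, right = ("tie", "x3", "x4"), ("tie", "x5", "x6")
    print(is_normalized("tie", [left, right], [("~", [left, right])]))
```

The usage line encodes a production whose constraint relates the two arguments of `tie`, both of which are strict subterms of the right-hand side.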
Every normalized constrained grammar which contains only constraints with $\approx$,
$\not\approx$ in its production rules is equivalent to a tree automaton with equality and
disequality constraints (AWEDC), see [10] for a survey. Therefore, constrained
grammars inherit the properties of AWEDC concerning emptiness decision, and
(ED), (EI) are decidable for a normalized constrained grammar $\mathcal{G}$ in each of the
following cases, stated for every production $\llcorner t \lrcorner := f(\llcorner u_1 \lrcorner, \ldots, \llcorner u_n \lrcorner)\ [\![c]\!]$:

1. the constraints in $c$ have the form $u_i \approx u_j$ or $u_i \not\approx u_j$ [1],

2. the constraints in $c$ are only disequalities $s_1 \not\approx s_2$ [11],

3. the constraints in $c$ are equalities and disequalities, and for every (ground)
constrained term $t [\![c]\!]$ generated by $\mathcal{G}$, for every path $p \in \mathcal{P}os(t)$, the number
of subterms $s$ occurring along $p$ in $t$ and such that $s \approx s'$ or $s' \approx s$ is an
atomic constraint of $c$ is bounded (independently from $t$ and $c$) [14],

4. the constraints in $c$ are equalities and disequalities, and for every (ground)
constrained term $t [\![c]\!]$ generated by $\mathcal{G}$, for every path $p \in \mathcal{P}os(t)$, the number
of subterms $s$ satisfying the following conditions (i-iii) is bounded
(independently from $t$ and $c$) [8]

(i) $s$ occurs along $p$ in $t$,

(ii) $s \approx s'$ or $s' \approx s$ is an atomic constraint of $c$,

(iii) $s$, $s'$ are not brothers in a subterm $f(\ldots, s, \ldots, s', \ldots)$ occurring on $p$.

Theorem 4. All the conditions of the simplification rules in Figures 2, 3 and the
inference rules in Figure 4 are decidable or make recursive calls to the procedure
itself when $\mathcal{R}$ is ground confluent and, for all $l \to r\ [\![c]\!] \in \mathcal{R}_C$, for all $s \approx s' \in c$
(resp. all $s \not\approx s' \in c$), $s$ and $s'$ are either variables or strict subterms of $l$ (resp.
variables or strict subterms occurring at sibling positions in $l$).

Proof. When the constraints of $\mathcal{R}_C$ fulfill the above conditions, then $\mathcal{G}_{NF}(\mathcal{R}_C)$ is
in category 4, hence (ED) and (EI) are decidable. Hence the conditions in the
inference and simplification rules in Figures 2, 3, 4 which are not recursive calls
are decidable by Corollary 3 and Lemmas 6, 7. □

The algorithms provided in the literature for the emptiness decision for the
classes 1 to 4 of tree automata with equality and disequality constraints are
all very costly, due to the inherent complexity of the problem. For instance,
for the "easiest" class 1, the problem is EXPTIME-complete [10], see also [21,
11] concerning class 2. The problem is however less difficult for deterministic
automata (e.g., PTIME for class 1), like the one of Figure 1.

Cleaning algorithms, which may behave better on average, have been proposed
[8] for optimizing emptiness decision. An interesting aspect of the cleaning
algorithm is its monotonicity: an incremental change to the input automaton
causes only an incremental change of the intermediate structure constructed by
the algorithm for emptiness decision. This should make it possible to reuse such
structures in our setting because all the constrained grammars of Section 6.1 are
incrementally obtained from the unique normal form grammar $\mathcal{G}_{NF}(\mathcal{R}_C)$.

Another promising approach for implementation is the use of first-order
saturation techniques. It has been studied for solving various decision problems for
several classes of tree automata with or without constraints [18, 13, 17].






7 Handling Partial Specifications 



The example of sorted lists in Section 3 can be treated with our procedure
because it is based on a sufficiently complete and ground confluent conditional
constrained TRS $\mathcal{R}$ whose constructor part $\mathcal{R}_C$ is terminating. Indeed, under
these hypotheses, Theorem 1 ensures the soundness of our procedure for proving
inductive conjectures on this specification, and Corollary 1 and Theorem 3 ensure
respectively refutational completeness and soundness of disproof.

For sound proofs of inductive theorems wrt specifications which are not
sufficiently complete, we can rely on Theorem 2 and Corollary 2, which do not
require sufficient completeness of the specification but instead suppose that the
conjecture is decorated, i.e. that each of its variables is constrained to belong to
a language associated to a non-terminal of the normal form (constrained) grammar.
In this section, we propose two applications of this principle of decoration
of conjectures to the treatment of partial specifications. We treat the case where
the specification of defined functions is partial in Section 7.1, and the case where
axioms for constructors are partial in Section 7.2.

7.1 Partially Defined Functions 

Under the condition that the conjecture is decorated, extending a given suffi- 
ciently complete specification with additional axioms for defining partial (de- 
fined) functions preserves successful derivations. 

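Theorem 5. Assume that $\mathcal{R}$ is sufficiently complete and let $\mathcal{R}'$ be a consistent
extension of $\mathcal{R}$ where $\mathcal{R}'_C = \mathcal{R}_C$ and $\mathcal{R}'_D = \mathcal{R}_D \cup \mathcal{R}''_D$ ($\mathcal{R}''_D$ defines additional
partial defined functions). Let $\mathcal{E}_0$ be a set of decorated constrained clauses. Every
derivation $(\mathcal{E}_0, \emptyset) \vdash_{\mathcal{I}} \cdots$ successful wrt $\mathcal{R}$ is also a successful derivation wrt $\mathcal{R}'$.

Proof. The grammars $\mathcal{G}_{NF}(\mathcal{R}'_C)$ and $\mathcal{G}_{NF}(\mathcal{R}_C)$ are the same. Therefore every
inference step wrt $\mathcal{R}$ is also an inference step wrt $\mathcal{R}'$. □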

We apply Theorem 5 to a partial extension of the specification of Section 3. 

Specification of min for sorted lists. Let us complete the specification of
Section 3 with a new defined symbol $min : Set \to Nat$ and the following rules of
$\mathcal{R}_D$:

$min(ins(x, \emptyset)) \to x$
$min(ins(x, ins(y, z))) \to min(ins(x, z))\ [\![x \prec y]\!]$

The function $min$ is not sufficiently complete wrt $\mathcal{R}$ (the case $min(\emptyset)$ is missing).
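The partiality of $min$ can be observed by executing the two rules directly. The following Python sketch is an illustration only, under assumptions that are not part of the specification: naturals are Python integers, the constraint of the second rule is read as `x < y`, the argument is not normalized by the constructor rules beforehand, and the names `EMPTY`, `ins` and `min_set` are hypothetical. The function returns `None` when no rule applies.

```python
# Ground set terms: EMPTY is the empty set, ("ins", x, s) inserts x into s.
EMPTY = ("empty",)


def ins(x, s):
    return ("ins", x, s)


def min_set(t):
    """Rewrite min(t) with the two rules of min; None means that no rule applies."""
    while True:
        if t[0] != "ins":
            return None              # min(empty): the missing case
        x, rest = t[1], t[2]
        if rest == EMPTY:
            return x                 # min(ins(x, empty)) -> x
        y, z = rest[1], rest[2]
        if x < y:
            t = ins(x, z)            # min(ins(x, ins(y, z))) -> min(ins(x, z))  when x < y
        else:
            return None              # the constraint fails: min is stuck on this term


if __name__ == "__main__":
    print(min_set(ins(1, ins(2, ins(3, EMPTY)))))
    print(min_set(EMPTY))
```

On terms that are already sorted and duplicate-free the second rule always applies, which is why restricting variables to constructor normal forms avoids the stuck cases other than the empty set.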

Proof of two conjectures for min. We shall prove, using our inference system,
that the two following constrained and decorated conjectures are inductive
theorems of $\mathcal{R}$.

min{ins{x, ins{y, z))) min{ins{y, z)) y A x,y: ^x^^^ A z: l^^j^'I (12) 

min{ins{x, ins{y, z))) — > min{ins{y, z)) Ix y A x,y: ^Xj^^ A z: ^ins{xi,X2)_,} 

(13) 
Let us now prove that the conjecture (12) is an inductive theorem of $\mathcal{R}$. We
start by the simplification of (12) using Partial Splitting. We obtain:

min{ins{y, z)) = min{ins{y, z))lx Ki y A x y A x,y: ^x^^^ A z: ^x^^^ (14) 

min{ins{x, ins{y, z))) = min{ins{y, z))\x y A x y A x,y: ^x^^^ A z: ^x^^\ 

(15) 

The clause (14) is a tautology. Subgoal (15) is simplified using Partial Splitting 
again. We obtain: 

min{ins{y, ins{x,z))) = min{ins{y, z)) 

Ix y y Ax y Ax y Ax,y: ^x^^^ A z: ^x^^^ (16) 

min{ins{y,ins{x,z))) = min{ins{y,z)) 

Ix y A X >p= y A X ii y A x,y: ^x^^^ A z: ^x^^} (17) 

Subgoal (16) is simplified by $\mathcal{R}_D$ into $min(ins(y, z)) = min(ins(y, z))$, a tautology.
Subgoal (17) can also be deleted since the constraint $x \succ y$, $x \not\succ y$, $x \not\approx y$ is
unsatisfiable. This ends the proof that (12) is an inductive theorem of $\mathcal{R}$.
The proof of (13) follows the same steps.

Note that by Theorem 5 the proofs of the decorated conjectures (1.a), (1.b)
and (2.a), (2.b) in Section 3 remain valid for the above extended specification.

7.2 Partial Constructors and Powerlists

The restriction to decorated conjectures also makes it possible to deal with partial
constructor functions. In this case, we are generally interested in proving conjectures
only for constructor terms in the definition domain of the defined function
(well-formed terms). This is possible with our procedure when $\mathcal{R}_C$ is such that the set
of well-formed terms is the set of constructor $\mathcal{R}_C$-normal forms. Hence, decorating
the conjecture with the grammar's non-terminals, as in Theorem 2, amounts in
this case to restricting the variables to be instantiated by well-formed terms.

We illustrate this approach in this section with an example of application of
Theorem 2 to a non-complete specification of powerlists.

Specification of powerlists. A powerlist [24] is a list of length $2^n$ (for $n \ge 0$)
whose elements are stored in the leaves of a balanced binary tree. Kapur
gives in [20] a specification of powerlists and some proofs of conjectures with
an extension of RRL mentioned in the introduction. This example is carried out
with an extension of the algebraic specification approach where some partial
constructor symbols are restricted by application conditions. We propose below
another specification of powerlists which contains only constrained rewrite rules,
and which can be efficiently handled by our method.

We consider a signature for representing powerlists of natural numbers, with
the sorts $\mathcal{S} = \{Nat, List\}$ and the constructor symbols:

$\mathcal{C} = \{0 : Nat,\ s : Nat \to Nat,\ v : Nat \to List,\ tie : List \times List \to List,\ \bot : List\}$






The symbols $0$ and $s$ are used to represent the natural numbers in unary notation,
$v$ creates a singleton powerlist $v(n)$ of length 1 from a number $n$, and $tie$ is
the concatenation of powerlists. The operator $tie$ is restricted to well-balanced
constructor terms of the same depth. In order to express this property, we shall
consider a constructor rewrite system $\mathcal{R}_C$ which reduces to $\bot$ every term $tie(s, t)$
which is not well balanced. This way, only the well-defined powerlists are
$\mathcal{R}_C$-irreducible. For this purpose, we shall use a new binary constraint predicate $\sim$
defined on constructor terms of sort List as the smallest equivalence such that:

$v(x) \sim v(y)$ for all $x, y : Nat$
$tie(x_1, x_2) \sim tie(y_1, y_2)$ iff $x_1 \sim x_2 \sim y_1 \sim y_2$

The constructor TRS $\mathcal{R}_C$ has one rule constrained by $\sim$:

$tie(y_1, y_2) \to \bot\ [\![y_1 \not\sim y_2]\!]$     $tie(\bot, y) \to \bot$     $tie(y, \bot) \to \bot$
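The effect of these three rules can be pictured with a small Python sketch, given here as an illustration under stated assumptions rather than as part of the specification. Naturals are Python integers; on the terms that matter for normalization (children that are already normalized and different from $\bot$) the relation $\sim$ is implemented as "both terms are well balanced and have the same depth", which is our reading of the definition above; the names `BOT`, `v`, `tie`, `depth`, `similar` and `normalize` are hypothetical.

```python
BOT = ("bot",)


def v(n):
    return ("v", n)


def tie(a, b):
    return ("tie", a, b)


def depth(t):
    """Depth of a well-balanced powerlist term, or None if t is not well balanced."""
    if t[0] == "v":
        return 0
    if t[0] == "tie":
        d1, d2 = depth(t[1]), depth(t[2])
        if d1 is not None and d1 == d2:
            return d1 + 1
    return None


def similar(s, t):
    """Assumed reading of s ~ t on well-balanced terms: same depth."""
    d = depth(s)
    return d is not None and d == depth(t)


def normalize(t):
    """Bottom-up normalization with the three constructor rules."""
    if t[0] != "tie":
        return t
    a, b = normalize(t[1]), normalize(t[2])
    if a == BOT or b == BOT or not similar(a, b):
        return BOT                    # tie(bot, y), tie(y, bot), or unbalanced arguments
    return tie(a, b)


if __name__ == "__main__":
    print(normalize(tie(v(1), tie(v(2), v(3)))))
```

With this reading, a term is irreducible exactly when it is $\bot$ or a well-balanced tree of `tie` nodes over singletons.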



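Tree grammars with $\sim$-constraints on brother subterms. The normal
form tree grammar $\mathcal{G}_{NF}(\mathcal{R}_C)$ associated to $\mathcal{R}_C$ generates the well-formed ground
constructor terms. Its non-terminals, according to the construction in Section 4.2,
are: $\llcorner x^{Nat} \lrcorner$, $\llcorner x^{List} \lrcorner$, $\llcorner \bot \lrcorner$, $\llcorner tie(x_1, x_2) \lrcorner$ and its production rules:

$\llcorner tie(x_1, x_2) \lrcorner := tie(\llcorner x_1^{List} \lrcorner, \llcorner x_2^{List} \lrcorner)\ [\![x_1^{List} \sim x_2^{List}]\!]$

$\llcorner tie(x_1, x_2) \lrcorner := tie(\llcorner tie(x_3, x_4) \lrcorner, \llcorner tie(x_5, x_6) \lrcorner)\ [\![tie(x_3, x_4) \sim tie(x_5, x_6)]\!]$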

Note that all the constraints in these production rules are applied to brother 

subterms. Wc have omitted in the above list the non-terminal ^Xj^'^, and pro- 
duction rules of the form: ^x^^'^ := tie{^Xi^'^\ l^2j") bi;'"* 7^ or ^x^^ := 
tie{^±^, ^X2^"^). 

The emptiness problem is decidable for such constrained tree grammars. This
can be shown with an adaptation of the proof in [1] to $\sim$-constraints (instead of
equality constraints) or also by an encoding into the visibly tree automata with
one memory of [13].

Proof of a conjecture. We add to the specification a defined symbol $rev$:
$\mathcal{D} = \{rev : List \to List\}$ and a defined TRS $\mathcal{R}_D$:

$rev(\bot) \to \bot$   $(r_0)$
$rev(v(y)) \to v(y)$   $(r_1)$
$rev(tie(y_1, y_2)) \to tie(rev(y_2), rev(y_1))$   $(r_2)$

The conjecture is:

$rev(rev(x)) = x$   (18)

A proof of Conjecture (18) can be found in [20]. We prove (18) by the analysis
of several cases, where each case is treated quickly. As explained above, we need
to decorate its variables with non-terminals of the normal form grammar. There
are three possibilities:

$rev(rev(x)) = x\ [\![x : \llcorner x^{List} \lrcorner]\!]$   (19)
$rev(rev(x)) = x\ [\![x : \llcorner \bot \lrcorner]\!]$   (20)
$rev(rev(x)) = x\ [\![x : \llcorner tie(x_1, x_2) \lrcorner]\!]$   (21)
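The case split above can be pictured as an enumeration of all assignments of non-terminals to the variables of a conjecture. The Python sketch below is only an illustration of that enumeration and not the definition of decoration used by the procedure: it assumes that each variable may receive any non-terminal attached to its sort, and the non-terminal labels are plain strings chosen as placeholders for this example. The function name `decorate` and its parameters are hypothetical.

```python
from itertools import product


def decorate(var_sorts, nonterminals_by_sort):
    """Enumerate the membership constraints of a conjecture.

    var_sorts:            dict mapping each variable to its sort.
    nonterminals_by_sort: dict mapping each sort to the labels of the
                          non-terminals generating terms of that sort (assumed given).
    Returns one dict {variable: non-terminal label} per decorated instance.
    """
    names = sorted(var_sorts)
    choices = [nonterminals_by_sort[var_sorts[x]] for x in names]
    return [dict(zip(names, combo)) for combo in product(*choices)]


if __name__ == "__main__":
    # Placeholder labels for the non-terminals of the powerlist grammar.
    nts = {"Nat": ["x^Nat"], "List": ["x^List", "bot", "tie(x1,x2)"]}
    for constraint in decorate({"x": "List"}, nts):
        print(constraint)
```

For a conjecture with a single variable of sort List, the enumeration yields three decorated instances, one per non-terminal of that sort.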

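Let us apply the production rules of the grammar to Conjectures (19)
and (20) (inference Inductive Narrowing). It returns respectively:

$rev(rev(v(x))) = v(x)\ [\![x : \llcorner x^{Nat} \lrcorner]\!]$   (22)
$rev(rev(\bot)) = \bot$   (23)

The subgoals (22) and (23) are reduced by the rules $(r_1)$ and $(r_0)$ of $\mathcal{R}_D$
(Inductive Rewriting for Inductive Narrowing) into the respective tautologies:
$v(x) = v(x)\ [\![x : \llcorner x^{Nat} \lrcorner]\!]$ and $\bot = \bot$.

Now, let us apply Inductive Narrowing to Conjecture (21). The application of
the production rules of the grammar $\mathcal{G}_{NF}(\mathcal{R}_C)$ returns:

$rev(rev(tie(x_1, x_2))) = tie(x_1, x_2)$
$[\![x_1 : \llcorner x_3^{List} \lrcorner \wedge x_2 : \llcorner x_4^{List} \lrcorner \wedge x_3^{List} \sim x_4^{List}]\!]$   (24)

$rev(rev(tie(x_1, x_2))) = tie(x_1, x_2)$
$[\![x_1 : \llcorner x_3^{List} \lrcorner \wedge x_2 : \llcorner tie(x_4, x_5) \lrcorner \wedge x_3^{List} \sim tie(x_4, x_5)]\!]$   (25)

$rev(rev(tie(x_1, x_2))) = tie(x_1, x_2)$
$[\![x_1 : \llcorner tie(x_3, x_4) \lrcorner \wedge x_2 : \llcorner x_5^{List} \lrcorner \wedge tie(x_3, x_4) \sim x_5^{List}]\!]$   (26)

$rev(rev(tie(x_1, x_2))) = tie(x_1, x_2)$
$[\![x_1 : \llcorner tie(x_3, x_4) \lrcorner \wedge x_2 : \llcorner tie(x_5, x_6) \lrcorner \wedge tie(x_3, x_4) \sim tie(x_5, x_6)]\!]$   (27)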

Note that, with $(r_2)$:

$rev(rev(tie(x_1, x_2))) \to_{\mathcal{R}_D} rev(tie(rev(x_2), rev(x_1))) \to_{\mathcal{R}_D} tie(rev(rev(x_1)), rev(rev(x_2)))$

Hence, the reduction of (24) with the rule $(r_2)$ of $\mathcal{R}_D$ gives:

$tie(rev(rev(x_1)), rev(rev(x_2))) = tie(x_1, x_2)$
$[\![x_1 : \llcorner x_3^{List} \lrcorner \wedge x_2 : \llcorner x_4^{List} \lrcorner \wedge x_3^{List} \sim x_4^{List}]\!]$   (28)

and similarly for (25), (26), and (27).

This latter equation (28) can be reduced by Conjecture (21), considered as an
induction hypothesis (this is a case of Inductive Rewriting), giving the tautology:

$tie(x_1, x_2) = tie(x_1, x_2)\ [\![x_1 : \llcorner x_3^{List} \lrcorner \wedge x_2 : \llcorner x_4^{List} \lrcorner \wedge x_3^{List} \sim x_4^{List}]\!]$   (29)

The situation is the same for the other reduced equations and this completes the
proof of Conjecture (21).
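As a sanity check that complements the proof but does not replace it, Conjecture (18) can be tested on small well-formed powerlists. The Python sketch below encodes the three rules of $rev$ and enumerates balanced terms of bounded depth. It is an illustration under assumptions: naturals are Python integers, the helper `powerlists` and the other names are hypothetical, and only finitely many instances are checked.

```python
BOT = ("bot",)


def v(n):
    return ("v", n)


def tie(a, b):
    return ("tie", a, b)


def rev(t):
    """The three rules of rev: rev(bot) -> bot, rev(v(y)) -> v(y),
    rev(tie(y1, y2)) -> tie(rev(y2), rev(y1))."""
    if t[0] == "tie":
        return tie(rev(t[2]), rev(t[1]))
    return t


def powerlists(depth, leaves):
    """All well-balanced powerlists of the given depth over the given leaf values."""
    if depth == 0:
        return [v(n) for n in leaves]
    smaller = powerlists(depth - 1, leaves)
    return [tie(a, b) for a in smaller for b in smaller]


if __name__ == "__main__":
    ok = all(rev(rev(p)) == p for d in range(3) for p in powerlists(d, [0, 1]))
    print(ok)
```

Since `rev` maps well-balanced terms to well-balanced terms of the same depth, no constructor rule is needed when evaluating these instances.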
8 Conclusion



A fundamental issue in automatic theorem proving by induction is the computation
of a suitable finite description of the set of ground terms in normal form,
which can be used as an induction scheme. Normal form constrained tree grammars
are perfect induction schemes in the sense that they generate exactly the set
of constructor terms in normal form. By contrast, test sets and cover sets are
approximate induction schemes when the constructors are not free. They may
indeed also represent some reducible ground terms, and therefore may cause the
failure (a result of the form "don't know") of an induction proof when constructors
are not free. In this case, refutational completeness is not guaranteed. This
explains the choice of constrained grammars for the incremental generation of
subgoals. Constrained tree grammars are also used (by means of emptiness tests)
in order to detect in some cases that constructor subgoals are inductively valid.
Moreover, this formalism makes it possible to handle naturally constraints of
membership in a fixed regular tree language.

Our inference system allows rewrite rules between constructors which can be
constrained. Hence it makes it possible to automate induction proofs on complex data
structures. It is sound and refutationally complete, and allows for the refutation
of false conjectures, even with constrained constructor rules. Moreover,
all the conditions of inference rules are either recursive calls to the procedure
(Rewrite Splitting or Inductive Contextual Rewriting), or tests that are
decidable under some assumptions on the constraints of the rewrite system for
constructors. These assumptions are required for deciding emptiness of
constrained grammar languages.

Constraints in rules can serve to transform non-terminating specifications
into terminating ones, for instance in the presence of associativity and commutativity
axioms (ordering constraints), to define ad hoc evaluation strategies, e.g.
innermost rewriting, directly in the axioms (normal form constraints), or for the
analysis of trace properties of infinite state systems like security protocols
(constraints of membership in a regular tree language representing faulty traces [3]).
The treatment of membership constraints makes it possible to express in a natural way, in
conjectures, trace properties for the verification of systems. This idea has been
applied to the validation of security protocols and to the search for attacks (by
refutation) in a model with explicit destructor functions [3]. These symbols represent
operators like projection or decryption whose behaviour is specified with
constructor axioms.

Our procedure can handle partial specifications: specifications which are not
sufficiently complete and specifications with partial constructor functions along the
lines of [20]. Moreover, it preserves the proofs of decorated conjectures made
in a sufficiently complete specification when this specification is extended with
partial symbols.

The definition of tree grammars with constraints in Section 4 is very general.
It embeds some classes of grammars for which the emptiness problem is decidable
(see Section 6) and also classes for which this problem is still open. Therefore,
advances in tree automata theory can benefit our approach, and we are planning
to study new classes of tree automata with constraints.